About tsort operator on sorted data
Posted: Wed Jan 22, 2014 1:39 am
If data has already been sorted (Sort stage) and DataStage then auto-inserts a tsort operator (Join stage, same key as the Sort stage), is the data actually sorted again?
I've created a test job like this:
Seq_1 ---> Sort_1 ---> Copy_1 ---> Join_Stage --->Copy_3
Seq_2 ---> Sort_2 ---> Copy_2 ---^
Part of the job score is below. Notice that tsort operators have been auto-inserted at the Join stage although the data is already sorted.
main_program: This step has 7 datasets:
ds0: {op0[1p] (sequential Sequential_File_1)
eOther(APT_HashPartitioner { key={ value=KeyCol,
subArgs={ asc }
}
})<>eCollectAny
op2[2p] (parallel APT_CombinedOperatorController(0):Sort_1)}
ds1: {op1[1p] (sequential Sequential_File_2)
eOther(APT_HashPartitioner { key={ value=KeyCol,
subArgs={ asc }
}
})<>eCollectAny
op3[2p] (parallel APT_CombinedOperatorController(1):Sort_2)}
ds2: {op2[2p] (parallel APT_CombinedOperatorController(0):Copy_1)
eOther(APT_HashPartitioner { key={ value=KeyCol }
})#>eCollectAny
op4[2p] (parallel inserted tsort operator {key={value=KeyCol, subArgs={asc, cs}}}(0) in Join_Stage)}
ds3: {op3[2p] (parallel APT_CombinedOperatorController(1):Copy_2)
eOther(APT_HashPartitioner { key={ value=KeyCol }
})#>eCollectAny
op5[2p] (parallel inserted tsort operator {key={value=KeyCol, subArgs={asc, cs}}}(1) in Join_Stage)}
ds4: {op4[2p] (parallel inserted tsort operator {key={value=KeyCol, subArgs={asc, cs}}}(0) in Join_Stage)
[pp] eSame=>eCollectAny
op6[2p] (parallel APT_JoinSubOperatorNC in Join_Stage)}
ds5: {op5[2p] (parallel inserted tsort operator {key={value=KeyCol, subArgs={asc, cs}}}(1) in Join_Stage)
[pp] eSame=>eCollectAny
op6[2p] (parallel APT_JoinSubOperatorNC in Join_Stage)}
ds6: {op6[2p] (parallel APT_JoinSubOperatorNC in Join_Stage)
eAny=>eCollectAny
op7[2p] (parallel Copy_41)}
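In a longer score the inserted sorts are easy to miss, so a short script can list them. This is only a rough Python sketch: the regular expression assumes the score lines look exactly like the ones above, and `find_inserted_tsorts` is a made-up helper name, not a DataStage tool.

```python
import re

# Matches score fragments such as:
#   op4[2p] (parallel inserted tsort operator {key={value=KeyCol, subArgs={asc, cs}}}(0) in Join_Stage)
# Assumption: the format is the one shown in the score above; other versions may differ.
INSERTED_TSORT = re.compile(
    r"(op\d+)\[\d+p\] \(\w+ inserted tsort operator "
    r"\{key=\{value=(\w+),.*?\}\}\}\(\d+\) in (\w+)\)"
)


def find_inserted_tsorts(score_text):
    """Return unique (operator, key column, stage) tuples for inserted tsorts."""
    # Each operator appears twice in a score (once as consumer, once as
    # producer), so duplicates are removed with a set.
    found = set(INSERTED_TSORT.findall(score_text))
    # Order by the numeric part of the operator name (op4 before op10).
    return sorted(found, key=lambda item: int(item[0][2:]))


if __name__ == "__main__":
    import sys

    # Usage: python find_tsorts.py < job_score.txt
    for op, key, stage in find_inserted_tsorts(sys.stdin.read()):
        print(f"{op}: inserted tsort on {key} in {stage}")
```

This only shows where sorts were inserted; it says nothing about whether they actually re-sort the data at run time.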
My question: at run time, will the data be sorted again or not?