Explicit Sort Stage vs. tsort Operator?
Posted: Thu Jan 17, 2013 6:22 pm
Hi
I am joining two data sets using a Join stage. Both inputs are hash partitioned on the join key, but neither is sorted. I believe the parallel framework inserts a tsort operator when the input data is not sorted.
Some posts say it is better to add an explicit Sort stage, but I am not sure why. As far as I can tell, an explicit Sort stage and the inserted tsort operator both sort the data in the same way. Correct me if I am wrong.
Thanks