Regarding partitioning

Posted: Wed Apr 25, 2012 8:50 am
by nveejas
I have a question about partitioning. In my job I'm using sort -> remove duplicates -> join.

Here the sort and RD are based on 3 keys (say key1, key2 and key3) and the join is based on 2 keys (key1 and key2). In this case, do I need to re-partition (hash) in the join stage based on these 2 keys?

These two join keys are already hash partitioned in the sort stage.

Posted: Wed Apr 25, 2012 12:39 pm
by jwiles
If your data is already partitioned on key1 and key2 prior to the sort/RD, there is no need to repartition for the join stage: the partitioning already meets the requirements for all of the logic. RD will not affect the existing partitioning; it only removes records. If your data is partitioned on key1, key2 and key3 prior to the sort/RD, either remove key3 from the partitioning strategy (preferred) or repartition and re-sort prior to the join.
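To illustrate why the choice of partition keys matters, here is a minimal Python sketch. It is not DataStage code: the CRC32-based hash, the 4-partition setup and the helper names are assumptions made for illustration only. It shows that hashing on key1 and key2 keeps every record with the same join key in one partition, while hashing on all three keys can scatter them.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed degree of parallelism, for illustration only


def partition_of(record, keys, num_partitions=NUM_PARTITIONS):
    """Pick a partition from a stable hash of the chosen key columns.

    CRC32 is a stand-in here, not the hash function DataStage uses.
    """
    raw = "|".join(str(record[k]) for k in keys).encode("utf-8")
    return zlib.crc32(raw) % num_partitions


def partitions_per_join_key(records, partition_keys, join_keys):
    """Map each join-key value to the set of partitions its records land in."""
    placement = defaultdict(set)
    for rec in records:
        join_value = tuple(rec[k] for k in join_keys)
        placement[join_value].add(partition_of(rec, partition_keys))
    return dict(placement)


# Hypothetical sample data: two join-key values, eight key3 values each.
records = [
    {"key1": k1, "key2": k2, "key3": k3}
    for (k1, k2) in [("A", 1), ("B", 2)]
    for k3 in range(8)
]

join_keys = ["key1", "key2"]

# Partitioned on the join keys only: one partition per join-key value.
by_join_keys = partitions_per_join_key(records, ["key1", "key2"], join_keys)

# Partitioned on all three keys: the same join-key value can be spread out.
by_all_keys = partitions_per_join_key(records, ["key1", "key2", "key3"], join_keys)

for label, placement in [("key1,key2", by_join_keys), ("key1,key2,key3", by_all_keys)]:
    for join_value, parts in sorted(placement.items()):
        print(f"partitioned on {label}: {join_value} -> partitions {sorted(parts)}")
```

When the same join-key value appears in more than one partition, a parallel join can miss matches, which is why key3 should be dropped from the partitioning (it can still be a sort key) or the data repartitioned before the join.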

Regards,

Re: Regarding partitioning

Posted: Wed Apr 25, 2012 1:59 pm
by npsandeep
I agree with jwiles.

Posted: Thu Apr 26, 2012 12:54 am
by nveejas
Hi jwiles,

Thanks a lot. Now it's working fine.