Regarding partitioning

nveejas · Post by **nveejas** » Wed Apr 25, 2012 8:50 am

I have a question about partitioning. In my job I'm using sort -> remove duplicates -> join.

Here the sort and remove duplicates (RD) are based on 3 keys (say key1, key2 and key3), and the join is based on 2 keys (key1 and key2). In this case, do I need to repartition (hash) in the join stage on these 2 keys?

These two join keys are already hash-partitioned in the sort stage.

jwiles · Post by **jwiles** » Wed Apr 25, 2012 12:39 pm

If your data is already partitioned on key1 and key2 prior to the sort/RD, there is no need to repartition for the join stage: that partitioning already meets the requirements of all the downstream logic, because rows that match on key1, key2 and key3 necessarily match on key1 and key2 and so are already in the same partition. RD will not affect the existing partitioning, since it only removes records. If your data is partitioned on key1, key2 and key3 prior to the sort/RD, either remove key3 from the partitioning strategy (preferred) or repartition and re-sort prior to the join.
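To illustrate why the choice of partitioning keys matters here, below is a minimal Python sketch. It is not DataStage code and does not use DataStage's actual hash partitioner; the helper names (`partition_of`, `partitions_per_join_key`), the CRC32-based hash, the sample rows and the partition count of 4 are all assumptions made up for the illustration. It simply checks which partitions the rows for each (key1, key2) join value land in under the two strategies.

```python
from collections import defaultdict
import zlib


def partition_of(row, keys, num_partitions):
    """Assign a row to a partition by hashing the chosen key columns.

    CRC32 is only a stand-in for a real hash partitioner (assumption).
    """
    material = "|".join(str(row[k]) for k in keys).encode("utf-8")
    return zlib.crc32(material) % num_partitions


def partitions_per_join_key(rows, partition_keys, join_keys, num_partitions):
    """Map each join-key value to the set of partitions its rows land in."""
    placement = defaultdict(set)
    for row in rows:
        join_value = tuple(row[k] for k in join_keys)
        placement[join_value].add(
            partition_of(row, partition_keys, num_partitions)
        )
    return dict(placement)


# Hypothetical sample data: 9 distinct (key1, key2) pairs, 8 key3 values each.
rows = [
    {"key1": a, "key2": b, "key3": c}
    for a in range(3)
    for b in range(3)
    for c in range(8)
]
join_keys = ["key1", "key2"]

# Strategy 1: hash on the join keys only (key3 is used for sorting, not partitioning).
two_key = partitions_per_join_key(rows, ["key1", "key2"], join_keys, 4)

# Strategy 2: hash on all three sort/RD keys.
three_key = partitions_per_join_key(rows, ["key1", "key2", "key3"], join_keys, 4)

# Every join-key value sits in exactly one partition under strategy 1.
co_located = all(len(parts) == 1 for parts in two_key.values())

# Under strategy 2 at least one join-key value is spread over several partitions.
split = any(len(parts) > 1 for parts in three_key.values())
```

With partitioning on key1 and key2 only, `co_located` holds by construction, so a join on those two keys sees all of its candidate rows in one partition. With key3 added to the hash, rows sharing the same key1 and key2 can be scattered, which is why a repartition and re-sort would be needed before the join in that case.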

Regards,

npsandeep · Post by **npsandeep** » Wed Apr 25, 2012 1:59 pm

I agree with jwiles.

nveejas · Post by **nveejas** » Thu Apr 26, 2012 12:54 am

Hi Jwiles,

Thanks a lot. Now it's working fine.

DSXchange
