Regarding partitioning

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

nveejas
Participant
Posts: 11
Joined: Sun Sep 26, 2010 11:42 pm
Location: illinois

Post by nveejas »

I have a question about partitioning. My job uses Sort -> Remove Duplicates -> Join.

The Sort and Remove Duplicates (RD) stages use three keys (key1, key2 and key3), and the Join uses two keys (key1 and key2). In this case, do I need to repartition (hash) in the Join stage on those two keys?

These two join keys are already hash partitioned in the Sort stage.
Thanks,
Sajeev N
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

If your data is already partitioned on key1 and key2 prior to the sort/RD, there is no need to repartition for the Join stage: the partitioning already meets the requirements of all the downstream logic, and RD will not affect the existing partitioning because it only removes records. If your data is partitioned on key1, key2 and key3 prior to the sort/RD, rows that share key1 and key2 but differ in key3 can land in different partitions, so either remove key3 from the partitioning keys (preferred; it can stay as a sort key) or repartition and re-sort prior to the join.
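To see why the choice of partitioning keys matters, here is a minimal Python sketch of the idea. It is a toy model only: the CRC-based `partition_of` function, the four-partition layout and the sample rows are all assumptions made for illustration, and they do not reproduce DataStage's actual hash partitioner.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical number of partitions (nodes)


def partition_of(row, keys, num_partitions=NUM_PARTITIONS):
    """Toy hash partitioner (an assumption, not DataStage's algorithm):
    hash the values of `keys` and map them to a partition number."""
    key_bytes = "|".join(str(row[k]) for k in keys).encode("utf-8")
    return zlib.crc32(key_bytes) % num_partitions


def partitions_per_join_key(rows, partition_keys, join_keys):
    """For each distinct join-key value, collect the partitions its rows land in."""
    placement = defaultdict(set)
    for row in rows:
        join_value = tuple(row[k] for k in join_keys)
        placement[join_value].add(partition_of(row, partition_keys))
    return dict(placement)


def is_join_safe(rows, partition_keys, join_keys):
    """True when every join-key value lives in exactly one partition,
    so a partition-local join sees all of its matching rows."""
    placement = partitions_per_join_key(rows, partition_keys, join_keys)
    return all(len(parts) == 1 for parts in placement.values())


# Hypothetical sample data: many key3 values under the same (key1, key2).
rows = [
    {"key1": k1, "key2": k2, "key3": k3}
    for k1 in (1, 2)
    for k2 in ("A", "B")
    for k3 in range(20)
]

join_keys = ["key1", "key2"]
print(is_join_safe(rows, ["key1", "key2"], join_keys))
print(is_join_safe(rows, ["key1", "key2", "key3"], join_keys))
```

Partitioning on key1 and key2 keeps every join-key value in a single partition by construction, so the first check passes. With key3 included in the hash, rows with the same key1 and key2 are spread over several partitions in this sample, so the second check fails, which is the case where a repartition before the join would be needed.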

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
npsandeep
Participant
Posts: 6
Joined: Wed Apr 25, 2012 6:39 am

Re: Regarding partitioning

Post by npsandeep »

I agree with jwiles.
nveejas
Participant
Posts: 11
Joined: Sun Sep 26, 2010 11:42 pm
Location: illinois

Post by nveejas »

Hi Jwiles,

Thanks a lot. Now it's working fine.
Thanks,
Sajeev N