Remove Duplicates problem when using 2X

_chamak
Premium Member
Posts: 29
Joined: Tue Aug 24, 2010 10:29 am

Post by _chamak »

I have a job that currently runs fine with 1X but has a problem when I change it to 2X. The job has a Sort followed by a Remove Duplicates. The data is hash partitioned in the Sort on the key columns; the data type for both of them is integer. With 1X, Remove Duplicates removes 148 rows, but with 2X it removes only 147. Can anyone help me with this?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Welcome aboard.

Are the data partitioned on the "key used to determine duplicates"?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
_chamak
Premium Member
Posts: 29
Joined: Tue Aug 24, 2010 10:29 am

Post by _chamak »

ray.wurlod wrote: Welcome aboard.

Are the data partitioned on the "key used to determine duplicates"?
I am partitioning on the key columns in the Sort before the Remove Duplicates.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are the sort keys the same as the remove duplicate keys?
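
To illustrate why the two questions above matter, here is a minimal Python sketch (not DataStage code) of a partition, sort, and remove-duplicates flow. Everything in it is an assumption made for illustration: the function name `removed_count` is hypothetical, the sample rows are made up, and a simple modulus stands in for the real hash partitioner. It only shows one way the removed-row count can drop when the partitioning columns differ from the duplicate-detection columns; the thread does not confirm that this is what happened in the job.

```python
from itertools import groupby


def removed_count(rows, partition_cols, dedup_cols, n_partitions):
    """Count rows dropped by a per-partition sort + remove-duplicates.

    rows           : list of tuples of integers (illustrative data)
    partition_cols : column indexes used to assign a row to a partition
    dedup_cols     : column indexes that define a duplicate
    n_partitions   : degree of parallelism (1 for "1X", 2 for "2X")
    """
    # Stand-in for hash partitioning: sum of the key values modulo n.
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        target = sum(row[c] for c in partition_cols) % n_partitions
        partitions[target].append(row)

    def dedup_key(row):
        return tuple(row[c] for c in dedup_cols)

    removed = 0
    for part in partitions:
        # Each partition is sorted and de-duplicated independently,
        # so duplicates split across partitions are never compared.
        part.sort(key=dedup_key)
        kept = sum(1 for _ in groupby(part, key=dedup_key))
        removed += len(part) - kept
    return removed


# Made-up sample: column 0 is the duplicate key, column 1 is another column.
rows = [(1, 10), (1, 11), (2, 20), (2, 20), (3, 30)]

print(removed_count(rows, [0], [0], 1))  # one partition
print(removed_count(rows, [0], [0], 2))  # two partitions, matching keys
print(removed_count(rows, [1], [0], 2))  # two partitions, mismatched keys
```

In this sketch the first two calls remove the same number of rows, because rows sharing a duplicate key always land in the same partition. The third call removes fewer, because the two rows with key 1 are sent to different partitions and are never seen side by side.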