Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.
Moderators: chulett , rschirm , roy
_chamak
Premium Member
Posts: 29 Joined: Tue Aug 24, 2010 10:29 am
by _chamak » Mon Nov 15, 2010 12:52 pm
I have a job that currently runs fine with 1X but has a problem when I change it to 2X. It has a sort followed by remove duplicates. The data is hash partitioned in the sort on the key columns; the data type for both of them is integer. With 1X, the remove duplicates removes 148 rows, but with 2X it removes only 147. Can anyone help me with this?
ray.wurlod
Participant
Posts: 54607 Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
by ray.wurlod » Mon Nov 15, 2010 2:12 pm
Welcome aboard.
Are the data partitioned on the "key used to determine duplicates"?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
_chamak
Premium Member
Posts: 29 Joined: Tue Aug 24, 2010 10:29 am
by _chamak » Mon Nov 15, 2010 4:00 pm
ray.wurlod wrote: Welcome aboard.
Are the data partitioned on the "key used to determine duplicates"?
I am partitioning on the key columns in the sort before the remove duplicates.
ray.wurlod
Participant
Posts: 54607 Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
by ray.wurlod » Mon Nov 15, 2010 11:01 pm
Are the sort keys the same as the remove duplicate keys?
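To illustrate why the partitioning and key questions in this thread matter, here is a minimal Python sketch (not DataStage code) of partition, then sort, then remove adjacent duplicates. Everything in it is an assumption made for illustration: the column names `id` and `seq`, the sample rows, and the sum-modulo `partition` function, which only stands in for a hash partitioner. It shows one possible way to lose exactly this kind of row: if the partitioning key includes a column that is not part of the duplicate key, two rows that count as duplicates can land in different partitions and never be compared.

```python
def partition(rows, part_cols, n_parts):
    """Assign each row to a partition from its partitioning columns.

    Sum-modulo is a simple deterministic stand-in for a hash partitioner.
    """
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[sum(row[c] for c in part_cols) % n_parts].append(row)
    return parts


def sort_and_dedup(part, sort_cols, dedup_cols):
    """Sort one partition, then drop rows whose dedup key equals the previous kept row's."""
    ordered = sorted(part, key=lambda r: tuple(r[c] for c in sort_cols))
    kept = []
    for row in ordered:
        key = tuple(row[c] for c in dedup_cols)
        if kept and tuple(kept[-1][c] for c in dedup_cols) == key:
            continue  # adjacent duplicate within this partition: removed
        kept.append(row)
    return kept


def rows_removed(rows, part_cols, sort_cols, dedup_cols, n_parts):
    """Count rows removed across all partitions; partitions never see each other's rows."""
    parts = partition(rows, part_cols, n_parts)
    kept = sum(len(sort_and_dedup(p, sort_cols, dedup_cols)) for p in parts)
    return len(rows) - kept


# Hypothetical sample data: duplicates are defined on "id" alone.
rows = [
    {"id": 1, "seq": 1},
    {"id": 1, "seq": 2},
    {"id": 2, "seq": 2},
    {"id": 2, "seq": 4},
]

# Partitioned on (id, seq) but de-duplicated on id only.
print(rows_removed(rows, ["id", "seq"], ["id"], ["id"], 1))  # 2 removed in one partition
print(rows_removed(rows, ["id", "seq"], ["id"], ["id"], 2))  # 1 removed in two partitions

# Partitioned on the same key that defines a duplicate.
print(rows_removed(rows, ["id"], ["id"], ["id"], 2))  # 2 removed again
```

In the sketch, the counts agree across one and two partitions only when the partitioning columns are the same as (or a subset of) the columns that define a duplicate. Whether that is the cause in the job described above depends on the actual stage settings, which the thread has not yet shown.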