Issue with DataStage Remove Duplicates stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

das_nirmalya
Participant
Posts: 59
Joined: Thu Mar 20, 2008 12:11 am

Issue with DataStage Remove Duplicates stage

Post by das_nirmalya »

Hi

We are using the Remove Duplicates stage to eliminate duplicate contact information coming from a source file.

We hash partition on contact_text to remove duplicate contact numbers.

The job removes all the duplicate contact information except for contact number 13902324265.

Can anybody tell me what the probable reason could be?
nsd
jerome_rajan
Premium Member
Posts: 376
Joined: Sat Jan 07, 2012 12:25 pm
Location: Piscataway

Post by jerome_rajan »

Are you sorting the data?
Jerome
Data Integration Consultant at AWS
Connect With Me On LinkedIn

Life is really simple, but we insist on making it complicated.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The Remove Duplicates stage requires that data be sorted on the "key" column(s).
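To illustrate why the sort matters, here is a rough Python sketch of a de-duplicator that only compares each row with the one before it. It is a simplified model and not DataStage's actual implementation; the function name `remove_adjacent_duplicates` and the second phone number are made up for the example.

```python
from itertools import groupby


def remove_adjacent_duplicates(rows, key):
    """Keep the first row of each run of consecutive rows that share a key.

    Hypothetical helper: it models a de-duplicator that only looks at
    neighbouring rows, so duplicates that are not adjacent survive.
    """
    return [next(group) for _, group in groupby(rows, key=key)]


# Sample data: 13800000000 is an invented value used only for illustration.
rows = ["13902324265", "13800000000", "13902324265"]

# Unsorted input: the two copies are not adjacent, so both are kept.
unsorted_result = remove_adjacent_duplicates(rows, key=lambda r: r)

# Sorted input: the copies sit next to each other and one is dropped.
sorted_result = remove_adjacent_duplicates(sorted(rows), key=lambda r: r)

print(unsorted_result)
print(sorted_result)
```

In this model the duplicate only disappears once equal keys are adjacent, which is what sorting on the key column(s) guarantees.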

It is also possible that one of the "rogue" values contains a non-printing character, or an invisible one such as a space.
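One way to check for this outside DataStage is to dump the suspect values and list every character that is not a digit. The sketch below is only an illustration: the helper `describe_hidden` is hypothetical, and the "rogue" value with a space on the end is an assumed example, not data from the actual job.

```python
import unicodedata


def describe_hidden(value):
    """Return (index, name) for every character that is not an ASCII digit.

    Hypothetical helper; characters without a Unicode name (for example
    control characters) are reported using their repr() instead.
    """
    return [
        (i, unicodedata.name(ch, repr(ch)))
        for i, ch in enumerate(value)
        if not ("0" <= ch <= "9")
    ]


clean = "13902324265"
rogue = "13902324265 "  # assumed example: same number plus a space

print(clean == rogue)          # the two keys do not compare equal
print(describe_hidden(clean))  # nothing unexpected in the clean value
print(describe_hidden(rogue))  # shows where the extra character sits
print(clean == rogue.strip())  # equal once the whitespace is trimmed
```

If the values differ only by such a character, trimming or cleansing the key column before the sort and the Remove Duplicates stage should make them compare equal.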
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.