Issue with DataStage Remove Duplicates stage

das_nirmalya · Post by **das_nirmalya** » Tue Apr 16, 2013 1:58 am

Hi

We are using the Remove Duplicates stage to eliminate duplicate contact information coming from a source file.

We have used hash partitioning on contact_text to remove duplicate contact numbers.

The job removes all the duplicate contact information except for contact number 13902324265.

Can anybody tell me what the probable reason could be?

jerome_rajan · Post by **jerome_rajan** » Tue Apr 16, 2013 2:29 am

Are you sorting the data?

ray.wurlod · Post by **ray.wurlod** » Tue Apr 16, 2013 3:23 am

The Remove Duplicates stage requires that data be sorted on the "key" column(s).
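To see why sorting matters, here is a rough sketch in plain Python (not DataStage code) of a de-duplication pass that only compares each row with the one before it. The helper name `remove_adjacent_duplicates` and the second phone number are made up for illustration; only 13902324265 comes from this thread.

```python
from itertools import groupby

def remove_adjacent_duplicates(rows):
    """Keep the first row of each run of consecutive equal rows."""
    return [next(group) for _, group in groupby(rows)]

# Hypothetical input: the duplicates exist but are not next to each other.
rows = ["13902324265", "13800000000", "13902324265", "13800000000"]

# Unsorted: no two equal values are adjacent, so nothing is removed.
unsorted_result = remove_adjacent_duplicates(rows)

# Sorted on the key first: equal values become adjacent and collapse.
sorted_result = remove_adjacent_duplicates(sorted(rows))

print(unsorted_result)
print(sorted_result)
```

On unsorted input all four rows survive; after sorting on the key, only one row per contact number remains.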

It is also possible that one of the "rogue" values contains a non-printing character, or even a non-visible character such as a space.
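One way to check for such hidden characters is to dump the suspect values outside the job and inspect them. The following is a minimal Python sketch, not DataStage code; `clean_key` is a hypothetical helper, and the trailing space and tab in the sample values are assumptions about what the rogue rows might look like.

```python
def clean_key(value):
    """Drop non-printable characters, then trim surrounding whitespace."""
    printable = "".join(ch for ch in value if ch.isprintable())
    return printable.strip()

# Hypothetical values that look identical when displayed.
values = ["13902324265", "13902324265 ", "13902324265\t"]

# repr() makes trailing spaces and control characters visible.
for value in values:
    print(repr(value), len(value))

distinct_raw = set(values)                        # compared as-is
distinct_clean = {clean_key(v) for v in values}   # compared after cleaning

print(len(distinct_raw), len(distinct_clean))
```

If the raw values differ but the cleaned ones match, trimming or cleaning the key column before the sort and the Remove Duplicates stage should let the rows be treated as duplicates.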