how to remove all duplicate records

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
shalini11
Participant
Posts: 74
Joined: Thu Jan 22, 2009 3:00 am

how to remove all duplicate records

Post by shalini11 »

Hi All,

If the scenario is to remove all the records having duplicate key,the remove duplicate stage or the sort stage removes all except one.But i want to remove all records. Can it be done in Datastage?

Ex:
Col1 Col2 Col3
1 001 002
1 001 003
2 001 004
3 002 003

In this example, Col1 and Col2 are the key for removing duplicates.
I want to remove both the records having duplicate values for these columns.

Can it be implemented in DS or I need to write shell script for the same?


Thanks in adv
Pagadrai
Participant
Posts: 111
Joined: Fri Dec 31, 2004 1:16 am
Location: Chennai

Re: how to remove all duplicate records

Post by Pagadrai »

shalini11 wrote: In this example, Col1 and Col2 are the key for removing duplicates.
I want to remove both the records having duplicate values for these columns.
Can it be implemented in DS or I need to write shell script for the same?
Hi,
This can be implemented in DataStage.
First you need to get the count by grouping the records based on Col1 and Col2. You can do this with aggregator. Obtain the values where Count > 1

Then use the above result as lookup.
If lookup matches, then reject those records from incoming source.

* You can implement this logic in single job or in 2 separate jobs.
shalini11
Participant
Posts: 74
Joined: Thu Jan 22, 2009 3:00 am

Post by shalini11 »

Thanks for the quick reply

I'll try this and get back to u if i face any difficulty in implementing
shalini11
Participant
Posts: 74
Joined: Thu Jan 22, 2009 3:00 am

Post by shalini11 »

Thanks one again for your help

I am able to remove all the duplicate records now with your suggestion
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

shalini11 wrote:Thanks for the quick reply

I'll try this and get back to u if i face any difficulty in implementing
Why do you believe that U specifically will be able to assist?

The second person personal pronoun in English is spelled "you", not "u". Please use a professional standard of written English on DSXchange. It makes life easier for those whose first language is not English.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply