How to remove duplicate records?

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are you using a server job, a mainframe job or a parallel job?

In server and mainframe jobs, the easiest way is to sort the input stream on the key columns and use a stage variable (in a Transformer stage) to detect when a row has the same key as the previous row; that stage variable is then used in the output constraint expression so that duplicates are not passed through. It can also be done with a hashed file stage (server jobs only) or an aggregator stage.
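To make the logic concrete, here is a rough sketch in Python rather than DataStage itself. It is only an illustration of the idea, not job code: the function names dedupe_sorted and dedupe_keyed and the sample rows are made up for this example. The first function mimics the stage variable plus constraint on a sorted stream. The second mimics writing rows to a keyed store, on the assumption that a later write with the same key replaces the earlier one.

```python
def dedupe_sorted(rows, key):
    """Yield the first row of each run of equal keys.

    Assumes rows are already sorted on the key, as the sorted
    input stream would be in the job.
    """
    previous_key = object()  # sentinel that never equals a real key
    for row in rows:
        current_key = key(row)
        # Plays the role of the stage variable: same key as the previous row?
        is_duplicate = current_key == previous_key
        previous_key = current_key
        # Plays the role of the output constraint: pass only non-duplicates.
        if not is_duplicate:
            yield row


def dedupe_keyed(rows, key):
    """Keep one row per key by writing into a keyed store.

    Assumption: a later row with the same key overwrites the earlier
    one, so the last row seen for each key survives. Input need not
    be sorted.
    """
    latest = {}
    for row in rows:
        latest[key(row)] = row
    return list(latest.values())


# Hypothetical sample data: (customer_id, name), sorted on customer_id.
rows = [(1, "a"), (1, "b"), (2, "c"), (3, "d"), (3, "e")]
first_per_key = list(dedupe_sorted(rows, key=lambda r: r[0]))
last_per_key = dedupe_keyed(rows, key=lambda r: r[0])
```

Note the difference in which row survives: the sorted-stream version keeps the first row for each key, while the keyed-store version keeps the last, so the choice of method (and the sort order) matters when the duplicates are not identical in every column.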

In parallel jobs there is an explicit "remove duplicates" stage.



Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518