How to remove duplicate records?

3 posts • Page 1 of 1

ray.wurlod: Participant; Posts: 54607; Joined: Wed Oct 23, 2002 10:52 pm; Location: Sydney, Australia

Post by ray.wurlod » Tue Jun 24, 2003 5:41 am

Are you using a server job, a mainframe job or a parallel job?

In server and mainframe jobs, the easiest way is to sort the input stream on the key columns and use a stage variable (in a Transformer stage) to detect duplicates by comparing each row's key with the previous row's key; that stage variable is then used in the output constraint expression so that only the first row for each key passes. It can also be done with a Hashed File stage (server jobs only) or an Aggregator stage.
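The logic behind the sorted-input approach can be sketched outside DataStage. The Python below is only an illustration, not Transformer stage syntax: the function names `dedupe_sorted` and `dedupe_keyed`, the `id` key column and the sample rows are all made up for the example. The second function assumes that a keyed store such as a hashed file overwrites an existing row when the same key is written again, so the last row for each key is the one that survives.

```python
from operator import itemgetter


def dedupe_sorted(rows, key):
    """Yield the first row for each key value; input must be sorted on that key."""
    previous_key = object()  # sentinel that never equals a real key value
    for row in rows:
        current_key = row[key]
        # Plays the role of the stage variable: is this key the same as the last one?
        is_duplicate = current_key == previous_key
        previous_key = current_key
        # Plays the role of the output constraint: pass only non-duplicates.
        if not is_duplicate:
            yield row


def dedupe_keyed(rows, key):
    """Keyed-store analogue (assumption): a later write with the same key overwrites."""
    store = {}
    for row in rows:
        store[row[key]] = row
    return list(store.values())


# Hypothetical sample data with a duplicate key value of 2.
rows = [
    {"id": 2, "name": "b"},
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b2"},
]

sorted_rows = sorted(rows, key=itemgetter("id"))  # stable sort on the key column
unique_first = list(dedupe_sorted(sorted_rows, "id"))
unique_last = dedupe_keyed(rows, "id")
```

Note the difference between the two: the sorted approach keeps the first row seen for each key, while the keyed-store approach keeps the last, so the choice matters when duplicate rows differ in their non-key columns.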

In parallel jobs there is an explicit Remove Duplicates stage.

Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
