Remove duplicates-Unique only

Ush
Participant
Posts: 55
Joined: Tue Dec 04, 2007 3:15 am

Post by Ush »

Hi

Is it possible to use the Remove Duplicates stage to extract only the records whose key occurs exactly once? I am in the requirements analysis phase and do not have access to DataStage parallel edition.

ex:
Emp id
1
1
2
3
3

Only 2 should go through. In the manual I could find only the First and Last options.

Thanks
AmeyJoshi14
Participant
Posts: 334
Joined: Fri Dec 01, 2006 5:17 am
Location: Texas

Post by AmeyJoshi14 »

I don't think the Remove Duplicates stage can give you the desired result.
You can achieve this with an Aggregator stage followed by a Filter stage.

source ----> Aggregator (group on Emp_id, Aggregation Type = Count Rows) ----> Filter (count = 1) ----> Sequential File (target)
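
For anyone without a DataStage install to try this on, the count-then-filter logic of the flow above can be sketched in plain Python. This is only an illustration of the idea, not DataStage code; the function name `keys_occurring_once` is made up for this sketch, and the input is the sample data from the question.

```python
from collections import Counter

def keys_occurring_once(emp_ids):
    """Mimic the Aggregator (count rows per key) followed by the Filter (count = 1)."""
    counts = Counter(emp_ids)  # aggregator step: number of rows per Emp_id
    # filter step: keep only the keys that were seen exactly once
    return [key for key, n in counts.items() if n == 1]

result = keys_occurring_once([1, 1, 2, 3, 3])
print(result)  # [2]
```

Only Emp id 2 survives, which matches what the original question asks for.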
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Remove Duplicates stage will remove duplicates without needing any other stage.

You do, however, need to ensure that the data are key-partitioned and sorted on the keys being used to identify duplicates.
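
A small Python sketch of why the sort matters, assuming the stage works by comparing each row with its neighbour (which is what the sorting requirement suggests). This is not DataStage code, and `retain_first_per_key` is a hypothetical name used only here.

```python
from itertools import groupby

def retain_first_per_key(keys):
    """Keep one entry per run of equal adjacent keys (like a 'retain first' option)."""
    return [key for key, _run in groupby(keys)]

# Sorted input: every duplicate is adjacent, so each key appears once.
print(retain_first_per_key(sorted([3, 1, 2, 1, 3])))  # [1, 2, 3]

# Unsorted input: the two 1s are not adjacent, so both survive.
print(retain_first_per_key([1, 2, 1]))  # [1, 2, 1]
```

Note that this keeps one row for every key (1, 2 and 3 for the sample data), which is different from keeping only the keys that occur exactly once.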
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.