Remove duplicates - Unique only

Ush · Post by **Ush** » Thu Jan 24, 2008 2:25 am

Hi

Is it possible to use the Remove Duplicates stage to extract only the unique records? I am in the requirements analysis phase and do not have DS parallel available.

ex:
Emp id
1
1
2
3
3

Only 2 should go through. In the manual I could find only the first and last options.

Thanks

AmeyJoshi14 · Post by **AmeyJoshi14** » Thu Jan 24, 2008 3:36 am

I don't think you can get the desired result with the Remove Duplicates stage alone...

You can achieve this with an Aggregator stage followed by a Filter stage.

source ----> aggregator ( group on Emp_id + Aggregation Type--Count Rows ) ---> filter ( count = 1 ) ---> sequential file (target)
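The logic of that flow can be sketched outside DataStage. This is only a rough Python illustration of the count-then-filter idea, not DataStage code; the function name `unique_only` is hypothetical, and the input is the Emp id sample from the first post.

```python
from collections import Counter


def unique_only(emp_ids):
    """Return the keys that occur exactly once in the input."""
    # Aggregator step: count rows per Emp_id.
    counts = Counter(emp_ids)
    # Filter step: keep only the groups where count = 1.
    return [emp_id for emp_id, count in counts.items() if count == 1]


result = unique_only([1, 1, 2, 3, 3])
print(result)  # [2]
```

Only Emp id 2 survives, because 1 and 3 each have a count of 2.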

ray.wurlod · Post by **ray.wurlod** » Thu Jan 24, 2008 7:35 am

Remove Duplicates stage will remove duplicates without needing any other stage.

You do, however, need to ensure that the data are key-partitioned and sorted on the keys being used to identify duplicates.
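For comparison, here is a rough Python sketch of what keeping the first or last record per key looks like on sorted data. It assumes the stage behaves as the "first" and "last" options mentioned above suggest; the function name `remove_duplicates` and the second column of each row are hypothetical, and this is not DataStage code.

```python
from itertools import groupby


def remove_duplicates(rows, key, retain="first"):
    """Keep one row per key value: the first or the last in sorted order."""
    # Sort on the key so that duplicates sit next to each other.
    ordered = sorted(rows, key=key)
    kept = []
    for _, group in groupby(ordered, key=key):
        group = list(group)
        kept.append(group[0] if retain == "first" else group[-1])
    return kept


# (Emp id, hypothetical payload column)
rows = [(1, "a"), (1, "b"), (2, "c"), (3, "d"), (3, "e")]
print(remove_duplicates(rows, key=lambda r: r[0]))
# [(1, 'a'), (2, 'c'), (3, 'd')]
print(remove_duplicates(rows, key=lambda r: r[0], retain="last"))
# [(1, 'b'), (2, 'c'), (3, 'e')]
```

In this sketch every key still appears once in the output, so Emp ids 1 and 3 are kept as well as 2.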