Remove Duplicates - Rejected Records

Posted: Wed Mar 12, 2008 4:26 am
by Ush
Hi

I have the following set of records:

Empno name
1 Ash
1 Ush
2 Reeta
3 x
4 Y

I have to retain the first record for each Empno and capture the rejected (duplicate) records. I can't use the Remove Duplicates stage since it does not have a reject link.

Please help

Posted: Wed Mar 12, 2008 4:34 am
by Nripendra Chand
You can use stage variables in a Transformer stage to get this result. Make sure that the records are hash partitioned and sorted on the required keys before the stage variable logic runs.
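
To illustrate why the partitioning matters, here is a rough Python sketch (not DataStage code; the function name `hash_partition` and the two-partition setup are assumptions made up for the example). If rows are distributed by a hash of the key, every row for a given Empno lands in the same partition, so comparing each row with the previous one inside a partition sees all of that key's duplicates.

```python
from collections import defaultdict


def hash_partition(rows, key_index, num_partitions):
    """Distribute rows so that equal keys always share a partition."""
    partitions = defaultdict(list)
    for row in rows:
        # Integer keys hash to themselves in Python, so this is deterministic here.
        partitions[hash(row[key_index]) % num_partitions].append(row)
    # Sort each partition on the key; sorted() is stable, so input order
    # is preserved among rows that share a key.
    return {p: sorted(part, key=lambda r: r[key_index])
            for p, part in partitions.items()}


rows = [(1, "Ash"), (1, "Ush"), (2, "Reeta"), (3, "x"), (4, "Y")]
parts = hash_partition(rows, key_index=0, num_partitions=2)
for partition_id, part in sorted(parts.items()):
    print(partition_id, part)
```

Both Empno 1 rows end up in the same partition, which is what the stage variable comparison relies on. With a partitioning method that ignores the key, they could be split across partitions and the duplicate would go undetected.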

Posted: Wed Mar 12, 2008 4:41 am
by ArndW
Sort the data on your key field, then use a Transformer stage that stores the previous record's key value in a stage variable, and use the output link constraints to route records accordingly.
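
The logic can be sketched in Python as follows. This is only an emulation of the idea, not Transformer syntax; `split_first_and_duplicates` and `prev_key` are hypothetical names, with `prev_key` playing the role of the stage variable and the two lists standing in for the two output links.

```python
def split_first_and_duplicates(rows, key_index=0):
    """Emulate sort + stage variable + output constraints."""
    kept, rejected = [], []
    prev_key = object()  # sentinel that never equals a real key
    # sorted() is stable, so the first row seen for a key stays first.
    for row in sorted(rows, key=lambda r: r[key_index]):
        if row[key_index] != prev_key:
            kept.append(row)       # constraint: key differs from previous row
        else:
            rejected.append(row)   # constraint: key equals previous row
        prev_key = row[key_index]  # update the "stage variable" last
    return kept, rejected


rows = [(1, "Ash"), (1, "Ush"), (2, "Reeta"), (3, "x"), (4, "Y")]
kept, rejected = split_first_and_duplicates(rows)
print(kept)
print(rejected)
```

For the sample data, the first list holds one row per Empno (Ash, Reeta, x, Y) and the second holds the rejected duplicate (Ush). The ordering matters: the comparison has to happen before the stored key is overwritten with the current row's key.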

Re: Remove Duplicates - Rejected Records

Posted: Wed Mar 12, 2008 4:53 am
by Nirmala84
[quote="Ush"]Hi

I have the following set of records:

Empno name
1 Ash
1 Ush
2 Reeta
3 x
4 Y

I have to retain the first record for each Empno and capture the rejected (duplicate) records. I can't use the Remove Duplicates stage since it does not have a reject link.

Please help[/quote]


Also, please help me fetch only the unique records out of the Remove Duplicates stage.

For example,

If I have the following set of records:

Empno name
1 Ash
1 Ush
2 Reeta
3 x
4 Y

The output of the Remove Duplicates stage should contain the following records:

Empno
1
2
3
4

Please help.

Posted: Wed Mar 12, 2008 4:57 am
by ArndW
Nirmala84 - the same method of using stage variables applies.

Posted: Wed Mar 12, 2008 6:56 am
by ccatania
The Remove Duplicates stage can retain either the first or the last record of each group of duplicates.

Posted: Wed Mar 12, 2008 12:31 pm
by r_arora
Here is a suggestion:
Use a Sort stage: sort on EmpNo and set the clusterKeyChange option to "True". Then put constraints on your Transformer so that all records with a clusterKeyChange value of 1 go to one dataset and the remaining records go to the other. You will get two datasets: one with a single record for each unique employee number, and the other with all the duplicate records.
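
A small Python sketch of the same idea, assuming the flag is 1 on the first row of each key group and 0 on the rows that follow (`add_key_change` is a made-up helper for illustration, not a DataStage function):

```python
def add_key_change(sorted_rows, key_index=0):
    """Append a flag: 1 for the first row of each key group, else 0."""
    flagged = []
    prev_key = None
    first = True
    for row in sorted_rows:
        change = 1 if first or row[key_index] != prev_key else 0
        flagged.append(row + (change,))
        prev_key = row[key_index]
        first = False
    return flagged


rows = sorted([(1, "Ash"), (1, "Ush"), (2, "Reeta"), (3, "x"), (4, "Y")],
              key=lambda r: r[0])
flagged = add_key_change(rows)

# The two Transformer constraints become two filters on the flag.
unique_rows = [r[:-1] for r in flagged if r[-1] == 1]
duplicate_rows = [r[:-1] for r in flagged if r[-1] == 0]
print(unique_rows)
print(duplicate_rows)
```

Compared with keeping the previous key in a stage variable, this moves the comparison into the sort step, so the Transformer only has to test one flag column.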