
How to remove duplicate & capture removed records in fil

Posted: Tue Feb 08, 2005 12:26 am
by akash_nitj
Hi Techies
Is there a way in DataStage to reject duplicate records and also capture the rejected duplicates in a file?

The Remove Duplicates stage doesn't have a reject link.

Is there any other easy way?
TIA
akash

Posted: Tue Feb 08, 2005 2:46 am
by ArndW
Morning Akash,

As you have noticed, the Remove Duplicates stage does only that and won't allow a second (reject) output link. If your input data stream is sorted on the comparison columns, I would use a Transformer stage with stage variables to detect whether or not each row duplicates the previous one, for example:

Code:

CurrentCompareString = {concatenated list of columns to use for comparison}
DuplicateRecord = IF LastCompareString = CurrentCompareString THEN @TRUE ELSE @FALSE
LastCompareString = CurrentCompareString
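
The stage variables need to be defined in this order, since they are evaluated top to bottom for each row. Then use link constraints on the logical value of "DuplicateRecord": one output link with the constraint Not(DuplicateRecord) for the rows to keep, and a second output link with the constraint DuplicateRecord to write the duplicates to a file.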
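
For anyone who wants to check the logic outside DataStage, here is a minimal Python sketch of the same compare-with-previous-row idea. It is only an illustration, not DataStage code: the function name split_duplicates and the key_columns parameter are made up for this example, and it assumes the rows arrive already sorted on the comparison columns.

```python
from typing import Iterable, List, Sequence, Tuple

Row = Tuple[str, ...]


def split_duplicates(
    rows: Iterable[Row], key_columns: Sequence[int]
) -> Tuple[List[Row], List[Row]]:
    """Split rows (assumed sorted on key_columns) into (kept, rejected).

    Hypothetical helper: the first row of each key group is kept,
    every following row with the same key is treated as a duplicate.
    """
    kept: List[Row] = []
    rejected: List[Row] = []
    last_key = None  # plays the role of LastCompareString

    for row in rows:
        # Plays the role of CurrentCompareString; a tuple is used
        # instead of a concatenated string for the comparison.
        current_key = tuple(row[i] for i in key_columns)
        is_duplicate = current_key == last_key  # DuplicateRecord
        last_key = current_key

        # These two branches correspond to the two link constraints.
        if is_duplicate:
            rejected.append(row)
        else:
            kept.append(row)

    return kept, rejected


if __name__ == "__main__":
    sample = [("1", "a"), ("1", "b"), ("2", "c"), ("3", "d"), ("3", "e")]
    kept, rejected = split_duplicates(sample, key_columns=[0])
    print("kept:", kept)
    print("rejected:", rejected)
```

One thing the sketch makes visible: comparing a tuple of column values avoids a pitfall of plain concatenation, where "AB" + "C" and "A" + "BC" produce the same compare string. In the Transformer, putting a delimiter between the concatenated columns would be one way to guard against that.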

Another option would be to use the CDC stage and two different SELECTs (one with UNIQUE) on the source data...