
datastage

Posted: Thu Feb 15, 2007 10:11 pm
by prasad.bodduluri
How do I find duplicate records using the Transformer stage in Server edition?

Posted: Thu Feb 15, 2007 10:30 pm
by narasimha
Please elaborate on your requirement.

Posted: Thu Feb 15, 2007 10:58 pm
by balajisr
Are you asking how to find duplicates using the Transformer?

Sort the data and use stage variables in the Transformer to find duplicates. Search this forum; this has been discussed many times.

Re: datastage

Posted: Fri Feb 16, 2007 12:22 am
by sudeepmantri
Hi, if you click the stage properties button in the Transformer you will be presented with a dialog box. Select the Input tab. Select the partitioning you want to use (most probably Hash). Select the two check boxes, Sort and Unique. Select the key column on which you want to perform the sort. That's it. It will eliminate any duplicate records based on the key columns you have selected.
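To illustrate why Hash partitioning is paired with Sort and Unique, here is a minimal Python sketch of the idea, not of DataStage itself. Hashing on the key sends all rows with the same key to the same partition, so a per-partition sort followed by a unique step can see every duplicate. The function names, the sample rows, and the choice to keep the first row in input order are assumptions made for this example.

```python
import zlib
from itertools import groupby

def hash_partition(rows, key, n_partitions):
    """Send each row to a partition chosen by a hash of its key value."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        idx = zlib.crc32(str(row[key]).encode("utf-8")) % n_partitions
        parts[idx].append(row)
    return parts

def sort_unique(partition, key):
    """Sort one partition on the key and keep one row per key value."""
    ordered = sorted(partition, key=lambda r: r[key])  # stable sort
    # groupby only merges adjacent equal keys, hence the sort above.
    return [next(group) for _, group in groupby(ordered, key=lambda r: r[key])]

# Hypothetical sample data; "id" plays the role of the key column.
rows = [
    {"id": 3, "name": "c"},
    {"id": 1, "name": "a"},
    {"id": 3, "name": "c-again"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "a-again"},
]

deduped = [
    row
    for part in hash_partition(rows, "id", 2)
    for row in sort_unique(part, "id")
]
```

With a partitioning that is not based on the key, two rows with the same key could land in different partitions and both would survive the unique step.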

Re: datastage

Posted: Fri Feb 16, 2007 12:44 am
by balajisr
sudeepmantri wrote: If you click the stage properties button in the Transformer you will be presented with a dialog box. Select the Input tab. Select the partitioning you want to use (most probably Hash). Select the two check boxes, Sort and Unique. Select the key column on which you want to perform the sort. That's it. It will eliminate any duplicate records based on the key columns you have selected.
The above option is not available in server edition.

Posted: Fri Feb 16, 2007 2:42 am
by aakashahuja
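Hi,

In Server edition you can achieve this with a few stage variables. For example:

Sort the source data on the key, then define these stage variables in this order (they are evaluated top to bottom for each row):

Orig = key1

Check = If Orig = Dup Then 1 Else 0

Dup = Orig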

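Assume that you have two output links, one for the normal data and one for the duplicate data. On the link for the normal data, set the constraint to

Check=0

On the duplicate link, set the constraint to

Check=1

Because Dup still holds the previous row's key when Check is evaluated, the first row of each key group goes to the normal link and every later row with the same key goes to the duplicate link. This includes the very first row, as long as the initial value of Dup can never match a real key. No special handling of the first or last row is needed.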
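The stage-variable logic can be checked outside DataStage with a small simulation. The following is a minimal Python sketch, not DataStage code: the function name, the sample rows, and the sentinel used as the initial value of Dup are assumptions for illustration. In this sketch the first row of every key group, including the very first input row, goes to the normal link, and the remaining rows of the group go to the duplicate link.

```python
def split_duplicates(sorted_rows, key):
    """Route rows the way the three stage variables would.

    sorted_rows must already be sorted on `key`, as in the job design.
    """
    normal_link, duplicate_link = [], []
    dup = object()  # assumed initial value of Dup: equals no real key
    for row in sorted_rows:
        orig = row[key]                   # Orig = key1
        check = 1 if orig == dup else 0   # Check = If Orig = Dup Then 1 Else 0
        dup = orig                        # Dup = Orig (remembered for next row)
        if check == 0:
            normal_link.append(row)       # constraint: Check=0
        else:
            duplicate_link.append(row)    # constraint: Check=1
    return normal_link, duplicate_link

# Hypothetical sample input, sorted on key1 before the "Transformer".
source = [
    {"key1": "C", "v": 4},
    {"key1": "A", "v": 1},
    {"key1": "C", "v": 5},
    {"key1": "B", "v": 3},
    {"key1": "A", "v": 2},
    {"key1": "C", "v": 6},
]
rows = sorted(source, key=lambda r: r["key1"])

normal, duplicates = split_duplicates(rows, "key1")
```

The order of the three assignments matters: Check has to be computed before Dup is overwritten, otherwise every row would compare equal to itself and be flagged as a duplicate.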

Hope the pseudocode works for you.

Thanks
Aakash