datastage

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

prasad.bodduluri
Participant
Posts: 30
Joined: Tue Jan 30, 2007 5:21 am
Location: bangalore

datastage

Post by prasad.bodduluri »

How do I find duplicate records using the Transformer stage in Server Edition?
prasad
narasimha
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

Please elaborate on your requirement.
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
balajisr
Charter Member
Posts: 785
Joined: Thu Jul 28, 2005 8:58 am

Post by balajisr »

Are you asking how to find duplicates using the Transformer?

Sort the data and use stage variables in the Transformer to find duplicates. Search this forum; this has been discussed many times.
sudeepmantri
Participant
Posts: 54
Joined: Wed Oct 25, 2006 11:07 pm
Location: Hyderabad

Re: datastage

Post by sudeepmantri »

Hi, if you click the stage properties button in the Transformer, you'll be presented with a dialog box. Select the Input tab. Select the partitioning you want to use (most probably Hash). Select the two check boxes, Sort and Unique. Select the key column on which you want to sort. That's it. It will eliminate any duplicate records based on the key columns you have selected.
balajisr
Charter Member
Posts: 785
Joined: Thu Jul 28, 2005 8:58 am

Re: datastage

Post by balajisr »

sudeepmantri wrote: Hi, if you click the stage properties button in the Transformer, you'll be presented with a dialog box. Select the Input tab. Select the partitioning you want to use (most probably Hash). Select the two check boxes, Sort and Unique. Select the key column on which you want to sort. That's it. It will eliminate any duplicate records based on the key columns you have selected.
The above option is not available in Server Edition; partitioning on the Input tab is a parallel job feature.
aakashahuja
Premium Member
Posts: 210
Joined: Wed Feb 16, 2005 7:17 am

Post by aakashahuja »

Hi,

In Server Edition you can achieve this with a few stage variables. For example:

Sort the source data.

Orig=key1

Check= if Orig=Dup Then 1 Else 0

Dup=Orig

Assume that you have two output links, one for normal data and the other for duplicate data. On the link for correct data, set the constraint to

Check=0 And @INROWNUM>1

On the duplicate link, set the constraint to

Check=1

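Note that the @INROWNUM>1 condition keeps the very first source row off the correct-data link, and that row normally does not satisfy Check=1 either. So you will need to pass the first row of your source file to the correct data file separately, or drop the @INROWNUM>1 condition and make sure the initial value of Dup cannot match a real key.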

Hope the pseudo code works for you.

Thanks
Aakash
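
For anyone who wants to check the stage-variable logic outside DataStage, here is a minimal Python sketch of the same idea. It is only a simulation, not DataStage code. The function name `split_duplicates`, the sentinel `_NO_PREVIOUS_KEY`, and the sample rows are hypothetical names made up for illustration. The sketch routes rows on Check alone and starts Dup at a sentinel that cannot equal a real key, so the first row is kept without any row-number test.

```python
from operator import itemgetter

_NO_PREVIOUS_KEY = object()  # hypothetical sentinel: never equals a real key


def split_duplicates(rows, key_field):
    """Route first occurrences and repeats of a key to separate lists."""
    unique_rows, duplicate_rows = [], []
    dup = _NO_PREVIOUS_KEY  # plays the role of the Dup stage variable
    # The technique relies on sorted input so that equal keys are adjacent.
    for row in sorted(rows, key=itemgetter(key_field)):
        orig = row[key_field]            # Orig = key1
        check = 1 if orig == dup else 0  # Check = If Orig = Dup Then 1 Else 0
        dup = orig                       # Dup = Orig (kept for the next row)
        if check == 0:
            unique_rows.append(row)      # "correct data" link
        else:
            duplicate_rows.append(row)   # "duplicate data" link
    return unique_rows, duplicate_rows


# Made-up sample data for illustration only.
sample_rows = [
    {"key1": "B", "val": 1},
    {"key1": "A", "val": 2},
    {"key1": "B", "val": 3},
]
unique_rows, duplicate_rows = split_duplicates(sample_rows, "key1")
print(unique_rows)
print(duplicate_rows)
```

The order of the three assignments inside the loop mirrors the order of the stage variables: Check has to be computed while Dup still holds the previous row's key, and only then is Dup updated.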