duplicates

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

laxmi_etl
Charter Member
Posts: 117
Joined: Thu Sep 28, 2006 9:10 am


Post by laxmi_etl »

Hi,

I need help writing duplicate records to a file.

I know we have some duplicates in the source file:
when I use the Remove Duplicates stage, I can see that the source file contains duplicate records.

My question is: is there any way to write those duplicates to a file?


Thanks
csrazdan
Participant
Posts: 127
Joined: Wed May 12, 2004 6:03 pm
Location: Chicago IL

Post by csrazdan »

You can also do this with the Sort stage. Add a Sort stage to your job design and sort on your key. Then set the Sort stage property Create Key Change Column, which adds a column to your output link. The value of this column is 1 for the first record in each sorted key group and 0 for every other record in that group.
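As a rough illustration of what the key change column does, here is a minimal Python sketch that runs outside DataStage. It is not DataStage code: the function name `add_key_change`, the column name `keyChange`, and the sample rows are assumptions made up for the example.

```python
from operator import itemgetter


def add_key_change(rows, key):
    """Sort rows on `key` and flag the first row of each key group.

    Mimics the behaviour described above: the flag is 1 for the first
    record of a sorted key group and 0 for the remaining records.
    The column name "keyChange" is an assumption for this sketch.
    """
    ordered = sorted(rows, key=itemgetter(key))  # stable sort on the key
    flagged = []
    previous = object()  # sentinel that never equals a real key value
    for row in ordered:
        current = row[key]
        flagged.append({**row, "keyChange": 1 if current != previous else 0})
        previous = current
    return flagged


# Hypothetical sample data with duplicate keys 1 and 2.
rows = [
    {"id": 2, "name": "b"},
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b2"},
    {"id": 1, "name": "a2"},
    {"id": 3, "name": "c"},
]

flagged = add_key_change(rows, "id")
# Rows flagged 0 are the second and later records of a key group.
duplicates = [row for row in flagged if row["keyChange"] == 0]
```

Here `duplicates` holds only the rows that repeat an earlier key, which are the rows the original question wants to write to a file.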

Hope it helps........
Assume everything I say or do is positive
sud
Premium Member
Posts: 366
Joined: Fri Dec 02, 2005 5:00 am
Location: Here I Am

Post by sud »

csrazdan wrote: You can also do this with the Sort stage. Add a Sort stage to your job design and sort on your key. Then set the Sort stage property Create Key Change Column, which adds a column to your output link. The value of this column is 1 for the first record in each sorted key group and 0 for every other record in that group.

Hope it helps........
Yes, the records with keychange = 0 are the duplicates, and they can be filtered out using a constraint.
It took me fifteen years to discover I had no talent for ETL, but I couldn't give it up because by that time I was too famous.
swades
Premium Member
Posts: 323
Joined: Mon Dec 04, 2006 11:52 pm

Post by swades »

Send the output of the Sort stage to a Filter stage. In the Where Clause, specify KeyChange=1, and under Options set Output Rejects=True (draw one reject link from the Filter stage). That way the duplicates are collected on the reject link.
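The filter-plus-reject idea can be sketched the same way in plain Python, again only as an illustration and not as DataStage code. The helper names `filter_with_reject` and `write_rows`, the `keyChange` column name, and the sample rows are assumptions; an in-memory buffer stands in for the output file.

```python
import csv
import io


def filter_with_reject(rows, predicate):
    """Split rows into those that pass the predicate and those rejected."""
    passed, rejected = [], []
    for row in rows:
        (passed if predicate(row) else rejected).append(row)
    return passed, rejected


def write_rows(rows, handle, fieldnames):
    """Write rows as CSV to an open text handle (a file or a buffer)."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical rows that are already sorted and carry a key change flag.
sorted_rows = [
    {"id": 1, "name": "a", "keyChange": 1},
    {"id": 1, "name": "a2", "keyChange": 0},
    {"id": 2, "name": "b", "keyChange": 1},
    {"id": 2, "name": "b2", "keyChange": 0},
    {"id": 3, "name": "c", "keyChange": 1},
]

# Rows with keyChange == 1 pass; everything else lands on the "reject" side.
unique_rows, duplicate_rows = filter_with_reject(
    sorted_rows, lambda row: row["keyChange"] == 1
)

# An in-memory buffer stands in for the duplicates file in this sketch.
buffer = io.StringIO()
write_rows(duplicate_rows, buffer, fieldnames=["id", "name", "keyChange"])
duplicates_csv = buffer.getvalue()
```

To write to disk instead, the same `write_rows` call would take a handle from `open(path, "w", newline="")`, with the path chosen by you.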
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

...and you could have gotten that answer with a simple search. A similar post was answered today.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.