Identify the duplicates

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

nagarjuna900
Participant
Posts: 35
Joined: Mon Dec 29, 2008 2:22 am
Location: chennai

Identify the duplicates

Post by nagarjuna900 »

Hi,

The source data is as follows:

Col1 Col2

1 aa
2 gg
1 aa
2 gg

Required output:

Col1 Col2
1 aa
2 gg

I need to identify whether duplicate records exist. Taking the first row as the reference, if any duplicates occur, an error message needs to be sent to an error table; if no duplicate exists, the record will be loaded into the table. (The duplicates are not to be removed.)

Thanks,
gssr
Participant
Posts: 243
Joined: Fri Jan 09, 2009 12:51 am
Location: India

Post by gssr »

Use an Aggregator stage to count rows on the key column.
In a Transformer, if the count is greater than one, load the row into the error table; otherwise load it into the table you require.
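
For anyone who wants to see the logic outside DataStage, here is a minimal Python sketch of the count-and-route idea. It is only an illustration, not a DataStage job: the function name split_by_duplicate_count is made up, and it assumes the key is the whole row (Col1 and Col2 together), which the original post does not state explicitly.

```python
from collections import Counter


def split_by_duplicate_count(rows):
    """Count rows per key, then route each key by its count.

    Assumption: each row is a hashable tuple and the entire row is the key.
    Returns (target, errors), where target holds keys seen exactly once and
    errors holds (key, count) pairs for keys seen more than once.
    """
    # "Aggregator" step: one count per distinct key, in first-seen order.
    counts = Counter(rows)

    # "Transformer" step: route on the count.
    target = [key for key, n in counts.items() if n == 1]
    errors = [(key, n) for key, n in counts.items() if n > 1]
    return target, errors


if __name__ == "__main__":
    # Sample data from the original post.
    source = [(1, "aa"), (2, "gg"), (1, "aa"), (2, "gg")]
    target, errors = split_by_duplicate_count(source)
    print("target:", target)
    print("errors:", errors)
```

With the sample data from the question, both keys occur twice, so both would be routed to the error list and nothing would reach the target.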
RAJ
mayura
Participant
Posts: 40
Joined: Fri Aug 01, 2008 5:58 am
Location: Mumbai

Re: Identify the duplicates

Post by mayura »

Use the Remove Duplicates stage or, if you are using a Lookup stage after the file, go to its partitioning settings, select "Perform sort" and then "Unique". You will get only good records.

Thanks,
Mayura
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

The original requirement explicitly stated a desire NOT to remove duplicates. gssr's solution is apposite.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.