Identify the duplicates

Posted: Thu May 27, 2010 3:23 am
by nagarjuna900
Hi,

The source data is as follows:

Col1 Col2

1 aa
2 gg
1 aa
2 gg

Required output:

Col1 Col2
1 aa
2 gg

I need to identify whether duplicate records exist. Based on the first row, if any duplicates occur, an error message needs to be sent to an error table; if no duplicate exists, the record is loaded into the table. (Duplicates are not to be removed.)

Thanks,

Posted: Thu May 27, 2010 3:28 am
by gssr
Use an Aggregator stage to count rows on the key column.
In a Transformer, if the count is greater than one, load the row into the error table; otherwise load it into the table you require.
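
The count-then-route logic can be sketched outside DataStage as follows. This is only an illustration in Python, not DataStage code: the function name route_by_key_count, the key_index parameter and the extra row (3, "zz") are assumptions made for the example. It also assumes the count is attached to every input row, so no rows are dropped.

```python
from collections import Counter

def route_by_key_count(rows, key_index=0):
    """Split rows into (target, errors) by how often each key occurs."""
    # Aggregator step: count rows per key column value.
    counts = Counter(row[key_index] for row in rows)
    target, errors = [], []
    # Transformer step: a count greater than one sends the row to the error list.
    for row in rows:
        if counts[row[key_index]] > 1:
            errors.append(row)
        else:
            target.append(row)
    return target, errors

# Sample rows from the question, plus one hypothetical unique row (3, "zz").
rows = [(1, "aa"), (2, "gg"), (1, "aa"), (2, "gg"), (3, "zz")]
target, errors = route_by_key_count(rows)
```

With these rows, every row whose key appears more than once ends up in errors, and only the hypothetical unique row reaches target.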

Re: Identify the duplicates

Posted: Thu May 27, 2010 5:18 am
by mayura
Use the Remove Duplicates stage. Alternatively, if you are using a Lookup stage after the file, go to the partitioning settings, select Perform Sort, and select Unique. You will get only the good records.

Thanks,
Mayrua

Posted: Thu May 27, 2010 11:10 am
by ray.wurlod
The original requirement explicitly stated a desire NOT to remove duplicates. gssr's solution is apposite.