Hello All,
My source contains duplicates. If any key is duplicated in the source, I want to reject both of those records.
Kindly help me with that.
Thanks
Remove duplicates
Re: Remove duplicates
Uppalapati
The Remove Duplicates stage doesn't have a reject option, nor does the Sort stage with remove duplicates checked.
To capture rejected duplicates, use a Transformer. Partition and sort on your primary key. In the Transformer, keep the primary key stored in a stage variable and compare each incoming primary key to that stored value. If it is the same, output the incoming row as a duplicate; if it is different, output the row as unique and save the new primary key.
You need at least two stage variables, one to do the comparison and the other to store the key value. Their order matters, because IsDuplicate must be evaluated before SavedKey is overwritten:
Variable: Derivation
IsDuplicate: input.keyfield = SavedKey
SavedKey: input.keyfield
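The stage-variable logic above can be sketched outside DataStage. This is only an illustration in Python, not Transformer syntax; the function name `split_duplicates` and the sample rows are made up for the example, and the input is assumed to be already sorted on the key.

```python
def split_duplicates(rows, key):
    """Mimic the two stage variables on rows already sorted by `key`."""
    saved_key = object()  # sentinel that equals no real key, for the first row
    unique, duplicates = [], []
    for row in rows:
        is_duplicate = row[key] == saved_key  # IsDuplicate: evaluated first
        saved_key = row[key]                  # SavedKey: overwritten afterwards
        (duplicates if is_duplicate else unique).append(row)
    return unique, duplicates


# Hypothetical sample data, sorted on the key before the comparison runs.
rows = sorted(
    [
        {"id": 2, "v": "b"},
        {"id": 1, "v": "a"},
        {"id": 2, "v": "c"},
        {"id": 3, "v": "d"},
    ],
    key=lambda r: r["id"],
)
unique, duplicates = split_duplicates(rows, "id")
```

Note that this sends only the second and later copies of a key to the duplicate output. The first copy still goes out as unique, so on its own it does not reject both records.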
Use a Sort stage instead of the Remove Duplicates stage. The Sort stage has more grouping options and sort indicator options.
Sort the records on the key field. In the Sort stage, set "key change column = true". The first record of each key group then gets 1 in that column and every duplicate gets 0. Then add a condition so that any record whose key change value is 0 is sent to the reject link.
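As a rough picture of what the key change column does, here is a small Python sketch. It is not DataStage code; the column name `keyChange`, the helper `add_key_change`, and the sample rows are assumptions for illustration.

```python
def add_key_change(rows, key):
    """Sort on `key` and flag the first row of each key group with 1, the rest with 0."""
    flagged = []
    previous = object()  # sentinel so the very first row always counts as a key change
    for row in sorted(rows, key=lambda r: r[key]):
        flagged.append({**row, "keyChange": 0 if row[key] == previous else 1})
        previous = row[key]
    return flagged


# Hypothetical sample data.
source = [
    {"id": 2, "v": "b"},
    {"id": 1, "v": "a"},
    {"id": 2, "v": "c"},
]
flagged = add_key_change(source, "id")

# Downstream condition: rows with keyChange = 0 go to the reject link.
main_link = [r for r in flagged if r["keyChange"] == 1]
reject_link = [r for r in flagged if r["keyChange"] == 0]
```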
You need a "fork join" design. Use a Copy stage to send the first column through an aggregator to get counted, then join back to the detail rows with a Join stage. You will have the count along with each detail row. Then filter based on the value of the count.
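The fork join idea can be sketched in Python to show the data flow. This is an illustration only, not a DataStage job; the function name `fork_join_reject`, the column `key_count`, and the sample rows are hypothetical.

```python
from collections import Counter


def fork_join_reject(rows, key):
    """Count rows per key, join the count back to each row, then filter on it."""
    counts = Counter(row[key] for row in rows)  # Aggregator branch: count per key
    joined = [{**row, "key_count": counts[row[key]]} for row in rows]  # Join back
    keep = [r for r in joined if r["key_count"] == 1]    # key occurs exactly once
    reject = [r for r in joined if r["key_count"] > 1]   # every copy of a repeated key
    return keep, reject


# Hypothetical sample data; no sorting is needed for this sketch.
rows = [
    {"id": 2, "v": "b"},
    {"id": 1, "v": "a"},
    {"id": 2, "v": "c"},
    {"id": 3, "v": "d"},
]
keep, reject = fork_join_reject(rows, "id")
```

Because the count is attached to every detail row, filtering on a count greater than 1 rejects all copies of a duplicated key, including the first one.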
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.