How to remove duplicates in Datastage server edition 8
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 102
- Joined: Tue Jan 31, 2006 4:13 am
How to remove duplicates in Datastage server edition 8
How do I remove duplicate records in DataStage Server Edition jobs? Which stage do I need to use? Is there a stage like 'Remove Duplicates' in Parallel Extender?
Sujatha K
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
Write them to a hashed file using the relevant keys. Last one in wins. The Hashed File stage implements destructive overwrite of keys.
Or use a Transformer stage with stage variables to detect change (or lack of) in sorted input.
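To illustrate the "last one in wins" behaviour of the hashed file suggestion, here is a rough sketch in Python rather than DataStage itself. A dict keyed on the relevant columns stands in for the hashed file; the function name and the sample rows are made up for the example.

```python
def dedupe_last_wins(rows, key_columns):
    """Keep one row per key; a later row overwrites an earlier one."""
    latest = {}
    for row in rows:
        # Build the key from the chosen columns, like a hashed file key.
        key = tuple(row[c] for c in key_columns)
        # Writing to an existing key replaces the earlier row.
        latest[key] = row
    return list(latest.values())


# Hypothetical sample data: two rows share id 1.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "c"},
]
result = dedupe_last_wins(rows, ["id"])
```

Here the row with id 1 and name "c" survives because it arrived last, which is the point to keep in mind when the input order matters.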
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Hi,
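In Server Edition 8 there is no dedicated stage for removing duplicates.
To remove duplicates on a column, use stage variables in a Transformer, defined in this order (stage variables are evaluated top to bottom):
SvCurr: link1.col1
SvKeep: If SvCurr = SvPre Then @FALSE Else @TRUE
SvPre: SvCurr
SvKeep is the extra stage variable that holds the result; SvPre is assigned last so that it still holds the previous row's value when SvKeep is evaluated.
Place SvKeep in the constraint of the output link.
Before this, you need to sort the data on that column using the Sort stage.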
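For anyone who wants to check the logic outside DataStage, this is a small Python sketch of the same idea: sort on the column, then compare each row with the previous one and pass only the rows where the value changes. The names sv_prev and sv_keep just mirror the stage variables, and the sample data is invented.

```python
_UNSET = object()  # sentinel that can never equal a real column value


def dedupe_sorted(rows, key_column):
    """Sort on key_column, then keep only the first row of each key value."""
    ordered = sorted(rows, key=lambda r: r[key_column])  # the Sort stage step
    sv_prev = _UNSET
    out = []
    for row in ordered:
        sv_curr = row[key_column]
        # Keep the row only when the key differs from the previous row's key.
        sv_keep = sv_curr != sv_prev
        # Remember the current key for the next row (assigned last).
        sv_prev = sv_curr
        if sv_keep:  # plays the role of the link constraint
            out.append(row)
    return out


# Hypothetical sample data with repeated col1 values.
rows = [
    {"col1": "b", "n": 1},
    {"col1": "a", "n": 2},
    {"col1": "b", "n": 3},
    {"col1": "a", "n": 4},
]
unique_rows = dedupe_sorted(rows, "col1")
```

Unlike the hashed file approach, this keeps the first row of each group after sorting, not the last.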
Regards,
Arshi
-
- Participant
- Posts: 5
- Joined: Thu May 22, 2008 11:50 am
Re: How to remove duplicates in Datastage server edition 8
sujaoschin wrote: How to remove the duplicate records in Datastage server edition jobs? Which stage I need to use? Is there any stage like 'remove duplicates' like in parallel extender?
As previously written, write to a Hash File. Or you can use an Aggregator Stage.
HTH
-
- Premium Member
- Posts: 102
- Joined: Tue Jan 31, 2006 4:13 am