Eliminate Duplicate data
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 2
- Joined: Thu Feb 17, 2011 12:11 am
How do we eliminate duplicate data using stage variables in a Transformer in DataStage?
-
- Participant
- Posts: 148
- Joined: Thu Apr 10, 2008 12:47 am
You never said the duplicates had to be captured...
There are many ways, but here is one which I have tried and tested.
Sort the data using a Sort stage on the key column which decides your duplicates.
Then aggregate on the same key,
so you will have
the key column value and a count.
Now inner join the input file with the output from the Aggregator.
Then put in a Transformer with two output links.
The constraint is on the count: count = 1 gives only the unique records, otherwise (count > 1) the record is a duplicate.
Use the partitioning method carefully:
in the Sort stage, hash partition on the key column(s), in the same order as the sort keys;
the Aggregator should keep the same partitioning;
but in the Join, use hash partitioning on both links.
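For anyone who wants to see the logic of this flow outside DataStage, here is a rough Python sketch of the same steps: sort, count per key, join the count back to each row, then split into two outputs. It is only an illustration, not DataStage code. The function name `split_unique_and_duplicates`, the column name `id`, and the sample rows are made up, and it assumes a key count of 1 means "unique" and anything higher means "duplicate".

```python
from collections import Counter


def split_unique_and_duplicates(rows, key):
    """Route rows to (unique, duplicates) based on how often their key occurs."""
    # Sort on the key column, as the Sort stage would (sorted() is stable).
    ordered = sorted(rows, key=lambda r: r[key])

    # Aggregate on the same key: one count per key value.
    counts = Counter(r[key] for r in ordered)

    # "Join" each row to its key count, then apply the constraint
    # the way a Transformer with two output links would.
    unique, duplicates = [], []
    for row in ordered:
        if counts[row[key]] == 1:  # assumption: count of 1 means unique
            unique.append(row)
        else:
            duplicates.append(row)
    return unique, duplicates


# Hypothetical sample data: id 3 occurs twice.
rows = [
    {"id": 3, "name": "c"},
    {"id": 1, "name": "a"},
    {"id": 3, "name": "c2"},
    {"id": 2, "name": "b"},
]
unique, duplicates = split_unique_and_duplicates(rows, "id")
```

Note that this sends every row of a repeated key to the duplicates output. If you want to keep one row per key instead, the constraint would need to change.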
-
- Participant
- Posts: 527
- Joined: Thu Apr 19, 2007 1:25 am
- Location: Melbourne