How to remove all duplicate records
Posted: Mon May 18, 2009 3:13 am
Hi All,
If the scenario is to remove all the records that have a duplicate key, the Remove Duplicates stage or the Sort stage removes all except one. But I want to remove all of them, including the one that is normally kept. Can this be done in DataStage?
Ex:
Col1 Col2 Col3
1 001 002
1 001 003
2 001 004
3 002 003
In this example, Col1 and Col2 are the key for removing duplicates.
I want to remove both of the records that share the same values in these columns (the first two rows here), so that only the rows with keys 2/001 and 3/002 remain.
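To make the expected result concrete, here is a minimal sketch of the logic in Python, outside of DataStage. It is only an illustration of the output I am after, not a DataStage solution; the function name `drop_all_duplicates` and the `key_indexes` parameter are made up for this example.

```python
from collections import Counter


def drop_all_duplicates(rows, key_indexes):
    """Keep only the rows whose key occurs exactly once in the input.

    Hypothetical helper for illustration: rows is a list of tuples,
    key_indexes lists the positions of the key columns.
    """
    def key_of(row):
        return tuple(row[i] for i in key_indexes)

    # First pass: count how many rows carry each key.
    counts = Counter(key_of(row) for row in rows)
    # Second pass: drop every row whose key appears more than once.
    return [row for row in rows if counts[key_of(row)] == 1]


# Sample data from the example; Col1 and Col2 (indexes 0 and 1) form the key.
rows = [
    ("1", "001", "002"),
    ("1", "001", "003"),
    ("2", "001", "004"),
    ("3", "002", "003"),
]

unique_rows = drop_all_duplicates(rows, key_indexes=(0, 1))
for row in unique_rows:
    print(" ".join(row))
# prints:
# 2 001 004
# 3 002 003
```

The idea is count-then-filter: count the rows per key, then keep only the keys with a count of 1.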
Can this be implemented in DataStage, or do I need to write a shell script for it?
Thanks in advance.