Sort with remove Dups
Posted: Wed Feb 18, 2015 11:12 am
To get the latest row based on the key columns, I am using a Sort stage followed by a Remove Duplicates stage. However, it works for some records but not all, and I am still seeing duplicates in the output.
Input--->Sort--->remove Dups--->Output
Here is the input data:
COL1 COL2 COL3 COL4 COL5
1 02/01/2015 02/28/2015 02/18/2015 01
1 03/01/2015 12/31/9999 02/18/2015 02
2 01/01/2015 02/28/2015 02/18/2015 01
3 01/01/2015 01/30/2015 02/18/2015 01
3 02/01/2015 12/31/9999 02/18/2015 02
3 03/01/2015 12/31/2015 02/18/2015 03
Expected output:
1 03/01/2015 12/31/9999 02/18/2015 02
2 01/01/2015 02/28/2015 02/18/2015 01
3 03/01/2015 12/31/2015 02/18/2015 03
In the Sort stage, partitioning is set to Auto.
Sorting Keys
key=COL1
Sort Key Mode=sort
Sort Order=Ascending
key=COL4
Sort Key Mode=sort
Sort Order=Ascending
key=COL2
Sort Key Mode=sort
Sort Order=Descending
key=COL4
Sort Key Mode=sort
Sort Order=Descending
In the Remove Duplicates stage, partitioning is set to Auto.
Keys that define duplicates
Key=COL1
Duplicates to retain=First
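To make the intent concrete, here is a small Python sketch (not DataStage code) of the logic I am after: order the rows by COL1 ascending and COL2 descending, then keep the first row for each COL1. The function name latest_per_key and the helper parse are made up for illustration, and the sketch assumes COL2 is compared as a date in MM/DD/YYYY format rather than as a string.

```python
from datetime import datetime

# Sample rows from the post: (COL1, COL2, COL3, COL4, COL5)
ROWS = [
    ("1", "02/01/2015", "02/28/2015", "02/18/2015", "01"),
    ("1", "03/01/2015", "12/31/9999", "02/18/2015", "02"),
    ("2", "01/01/2015", "02/28/2015", "02/18/2015", "01"),
    ("3", "01/01/2015", "01/30/2015", "02/18/2015", "01"),
    ("3", "02/01/2015", "12/31/9999", "02/18/2015", "02"),
    ("3", "03/01/2015", "12/31/2015", "02/18/2015", "03"),
]


def parse(value):
    """Parse an MM/DD/YYYY string so dates compare chronologically."""
    return datetime.strptime(value, "%m/%d/%Y")


def latest_per_key(rows):
    """Keep one row per COL1: the one with the latest COL2."""
    # Python's sort is stable, so sorting by the secondary key first
    # (COL2 descending) and then by the primary key (COL1 ascending)
    # yields COL1 ascending with COL2 descending inside each COL1 group.
    ordered = sorted(rows, key=lambda r: parse(r[1]), reverse=True)
    ordered = sorted(ordered, key=lambda r: r[0])

    seen = set()
    result = []
    for row in ordered:
        # Equivalent of "Duplicates to retain = First" on key COL1.
        if row[0] not in seen:
            seen.add(row[0])
            result.append(row)
    return result


if __name__ == "__main__":
    for row in latest_per_key(ROWS):
        print(" ".join(row))
```

The sketch works on a single in-memory list, so it only describes the result I expect. It says nothing about how rows are distributed across partitions in a parallel job.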
I am not sure where I went wrong. Please point me in the right direction.
Thanks in advance.