To get the latest row for each set of key columns, I am using a Sort stage followed by a Remove Duplicates stage. It works for some records but not all, and I am still seeing duplicates in the output.
Input ---> Sort ---> Remove Duplicates ---> Output
Here is the data:
COL1 COL2 COL3 COL4 COL5
1 02/01/2015 02/28/2015 02/18/2015 01
1 03/01/2015 12/31/9999 02/18/2015 02
2 01/01/2015 02/28/2015 02/18/2015 01
3 01/01/2015 01/30/2015 02/18/2015 01
3 02/01/2015 12/31/9999 02/18/2015 02
3 03/01/2015 12/31/2015 02/18/2015 03
Expected output:
1 03/01/2015 12/31/9999 02/18/2015 02
2 01/01/2015 02/28/2015 02/18/2015 01
3 03/01/2015 12/31/2015 02/18/2015 03
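To make the expected result concrete, here is a small sketch of the rule outside DataStage. It is only an illustration in Python, assuming that "latest" means the row with the most recent COL2 within each COL1; the function name latest_per_key is made up for this example.

```python
from datetime import datetime

# Sample rows from the post: (COL1, COL2, COL3, COL4, COL5)
rows = [
    ("1", "02/01/2015", "02/28/2015", "02/18/2015", "01"),
    ("1", "03/01/2015", "12/31/9999", "02/18/2015", "02"),
    ("2", "01/01/2015", "02/28/2015", "02/18/2015", "01"),
    ("3", "01/01/2015", "01/30/2015", "02/18/2015", "01"),
    ("3", "02/01/2015", "12/31/9999", "02/18/2015", "02"),
    ("3", "03/01/2015", "12/31/2015", "02/18/2015", "03"),
]


def latest_per_key(records):
    """Keep one row per COL1: the row with the most recent COL2."""
    # Sort by COL2 descending first, then by COL1 ascending. Python's sort
    # is stable, so within each COL1 the newest COL2 stays on top.
    by_date = sorted(
        records,
        key=lambda r: datetime.strptime(r[1], "%m/%d/%Y"),
        reverse=True,
    )
    ordered = sorted(by_date, key=lambda r: int(r[0]))

    kept, seen = [], set()
    for r in ordered:
        if r[0] not in seen:  # same idea as "Duplicates to retain = First"
            seen.add(r[0])
            kept.append(r)
    return kept


for row in latest_per_key(rows):
    print(" ".join(row))
```

The dates are parsed before comparing because MM/DD/YYYY strings do not sort chronologically across different years.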
In the Sort stage, partitioning is set to Auto.
Sorting Keys
key=COL1
Sort Key Mode=sort
Sort Order=Ascending
key=COL4
Sort Key Mode=sort
Sort Order=Ascending
key=COL2
Sort Key Mode=sort
Sort Order=Descending
key=COL4
Sort Key Mode=sort
Sort Order=Descending
In the Remove Duplicates stage, partitioning is set to Auto.
Keys that define duplicates
Key=COL1
Duplicates to retain=First
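One possible cause of the leftover duplicates, stated here as an assumption and not something confirmed in this post: in a parallel job, duplicates are removed within each partition, so rows that share COL1 but land in different partitions would each survive. The Python sketch below only simulates that effect with two partitions and the COL1/COL2 values from the sample; dedupe_in_partitions and both assignment functions are hypothetical names, not DataStage APIs.

```python
from datetime import datetime

# (COL1, COL2) pairs taken from the sample data in the post
sample = [
    ("1", "02/01/2015"), ("1", "03/01/2015"),
    ("2", "01/01/2015"),
    ("3", "01/01/2015"), ("3", "02/01/2015"), ("3", "03/01/2015"),
]


def dedupe_in_partitions(records, assign, n_partitions=2):
    """Split rows into partitions, then sort and dedupe each one separately."""
    partitions = [[] for _ in range(n_partitions)]
    for i, row in enumerate(records):
        partitions[assign(i, row) % n_partitions].append(row)

    kept = []
    for part in partitions:
        # COL2 descending, then COL1 ascending (stable sort keeps newest first)
        part.sort(key=lambda r: datetime.strptime(r[1], "%m/%d/%Y"), reverse=True)
        part.sort(key=lambda r: r[0])
        seen = set()
        for row in part:
            if row[0] not in seen:  # retain the first row per COL1
                seen.add(row[0])
                kept.append(row)
    return sorted(kept)


def round_robin(index, row):
    # Assignment ignores the key, so one COL1 can span several partitions
    return index


def by_key(index, row):
    # Assignment depends only on COL1, so one COL1 stays in one partition
    return int(row[0])


print(dedupe_in_partitions(sample, round_robin))  # COL1 values 1 and 3 appear twice
print(dedupe_in_partitions(sample, by_key))       # one row per COL1
```

If this is what is happening in the job, partitioning on COL1 alone ahead of the Sort stage and keeping that partitioning into Remove Duplicates would be the thing to try.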
I am not sure where I went wrong. Please point me in the right direction.
Thanks in advance.
Sort with remove Dups