Latest record to survive
-
- Participant
- Posts: 150
- Joined: Tue Mar 13, 2007 1:17 am
Hi,
I have a requirement to survive the most frequent non-blank value; in case of a tie, the latest record based on a date column should survive.
I am not able to find a way to pick the latest record's value when there is a tie. Any suggestions?
Thanks
-
- Participant
- Posts: 117
- Joined: Wed Feb 06, 2013 9:24 am
- Location: Chennai,TN, India
-
- Participant
- Posts: 150
- Joined: Tue Mar 13, 2007 1:17 am
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 150
- Joined: Tue Mar 13, 2007 1:17 am
I have already tried defining expressions such as c.timestamp > c.timestamp and b.timestamp > b.timestamp, but neither produces the desired result. In version 8.7 the only option available is "greater than"; there is nothing like "greatest".
An alternative would be to sort the data on the key column and the timestamp column, generate a string from the previous and current values, and in the Survive stage select the record with the longest string value.
I was hoping there would be a clearer way to define the greatest value.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 150
- Joined: Tue Mar 13, 2007 1:17 am
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Not really. Keep in mind what a survive rule is doing: it specifies a condition under which a field in the current (c) record will replace that field in the best-so-far (b) record. Median couldn't easily be supported, particularly when many steps are involved in the survive rule, since the median isn't available until all records in the block have been processed, but the comparisons are done one record at a time (the b record is initialized from the master).
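To make the one-record-at-a-time behaviour concrete, here is a rough Python model of it. This is only a sketch of the description above, not how the Survive stage is implemented: the `survive` function, the record layout, and the pairwise rule `c.timestamp > b.timestamp` are all assumptions for illustration, and the first record in the block stands in for the master. It also replaces the whole record rather than a single field, which is a simplification.

```python
from datetime import date

def survive(records, rule):
    """Fold over one block: 'best' starts as the first record (standing in
    for the master) and is replaced whenever rule(current, best) is true."""
    best = records[0]
    for current in records[1:]:
        if rule(current, best):
            best = current
    return best

# Hypothetical block of records sharing one match key.
block = [
    {"value": "A", "timestamp": date(2014, 1, 5)},
    {"value": "B", "timestamp": date(2014, 3, 9)},
    {"value": "C", "timestamp": date(2014, 2, 1)},
]

# A pairwise "greater than" against the best-so-far yields the greatest overall.
latest = survive(block, lambda c, b: c["timestamp"] > b["timestamp"])
print(latest["value"])
```

A pairwise comparison like this can find a maximum because each step only needs the current record and the best so far. A median has no such pairwise form, which matches the point made above.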
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 150
- Joined: Tue Mar 13, 2007 1:17 am
I think I have found the solution. Run the job in sequential mode and sort the data on the key and the timestamp column so that the latest record is processed last. Whenever there is a tie, the Survive stage seems to pick the value from the last record (I can't bet my life on this), but this is what a quick test has shown.
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
You don't need sequential mode if you partition on the key and sort on the key and timestamp column. The partitioning will keep groups of records on the same node.
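For checking job output against the requirement, the intended result can be sketched outside DataStage in plain Python. This is an illustration only: the function name `survive_value`, the `(key, value, timestamp)` row layout, and the ISO-formatted date strings are assumptions, and the code does not reproduce the Survive stage itself. It groups rows by key, sorts each group by timestamp, counts non-blank values, and on a tie keeps the tied value that occurs latest.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter

def survive_value(rows):
    """rows: iterable of (key, value, timestamp) tuples.
    Returns {key: most frequent non-blank value, ties broken by latest timestamp}."""
    result = {}
    # Sort on key, then timestamp, mirroring the sort described above.
    ordered = sorted(rows, key=itemgetter(0, 2))
    for key, group in groupby(ordered, key=itemgetter(0)):
        # Drop blank (None, empty, or whitespace-only) values.
        non_blank = [(v, ts) for _, v, ts in group if v and v.strip()]
        if not non_blank:
            result[key] = None
            continue
        counts = Counter(v for v, _ in non_blank)
        top = max(counts.values())
        # Rows are in ascending timestamp order, so the last row holding a
        # top-frequency value is the latest one among the tied values.
        tied = [v for v, _ in non_blank if counts[v] == top]
        result[key] = tied[-1]
    return result

# Hypothetical sample data.
rows = [
    ("k1", "red",  "2014-01-01"),
    ("k1", "blue", "2014-01-02"),
    ("k1", "red",  "2014-01-03"),
    ("k1", "blue", "2014-01-04"),
    ("k1", "",     "2014-01-05"),
    ("k2", "x",    "2014-01-01"),
    ("k2", "y",    "2014-02-01"),
    ("k2", "x",    "2014-03-01"),
]
survivors = survive_value(rows)
print(survivors)
```

Because every row for a key is handled inside one group, the result does not depend on how keys are spread across nodes, which is the same reason partitioning on the key is enough.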
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn: Vincent McBurney
-
- Participant
- Posts: 150
- Joined: Tue Mar 13, 2007 1:17 am