Calculating Days
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 49
- Joined: Wed Dec 13, 2006 6:31 pm
Calculating Days
Hi all, my data is in the following format.
Id Status Date
1 28 2007-09-24
1 5 2007-08-20
1 11 2007-09-15
1 27 2007-08-22
2 27 2007-09-22
2 5 2007-08-19
2 4 2007-08-20
2 1 2007-08-25
I need to calculate the number of days between the status changes. For example if status of Id 1 changes from 5 to 27, 27 to 28, 5 to 11 or any change from one status to another i have to calculate the number of days. The problem is the status can change in any order, so the ascending or descending order will not work. Could someone please give me an idea as to how i can work on this. I am not looking for the iconv or oconv stuff but how i can arrange the Status to calculate the days.
Thank You All
Id Status Date
1 28 2007-09-24
1 5 2007-08-20
1 11 2007-09-15
1 27 2007-08-22
2 27 2007-09-22
2 5 2007-08-19
2 4 2007-08-20
2 1 2007-08-25
I need to calculate the number of days between the status changes. For example if status of Id 1 changes from 5 to 27, 27 to 28, 5 to 11 or any change from one status to another i have to calculate the number of days. The problem is the status can change in any order, so the ascending or descending order will not work. Could someone please give me an idea as to how i can work on this. I am not looking for the iconv or oconv stuff but how i can arrange the Status to calculate the days.
Thank You All
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Pre-sort by Id then by Status. That having been done, you can use stage variables in a Transformer stage to detect change, and Iconv() of the date(s) - which you can "remember" also using stage variables - to calculate the days difference.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 49
- Joined: Wed Dec 13, 2006 6:31 pm
Calculating Days
Is it possible after sorting if my data is in the following format.
ID Status Date
1 28 2007-08-20
1 27 2007-08-10
1 26 2007-07-10
1 25 2007-08-20
1 11 2007-07-15
1 10 2007-07-10
1 5 2007-07-05
1 4 2007-07-17
1 2 2007-07-12
1 1 2007-07-13
Now this is after i have sorted the ID in ascending and Status in descending orders. I need to calculate days for status change from 5 to 28, 5 to 27, 5 to 26, 5 to 25, 10 to 11, 4 to 5 (this pattern is different from the previous ones), 2 to 5, or 1 to 5. If the number of days is negative i should ignore. My question is even after sorting will the calculations be possible for all the status changes using stage variables. Please let me know.
Thank you!!
ID Status Date
1 28 2007-08-20
1 27 2007-08-10
1 26 2007-07-10
1 25 2007-08-20
1 11 2007-07-15
1 10 2007-07-10
1 5 2007-07-05
1 4 2007-07-17
1 2 2007-07-12
1 1 2007-07-13
Now this is after i have sorted the ID in ascending and Status in descending orders. I need to calculate days for status change from 5 to 28, 5 to 27, 5 to 26, 5 to 25, 10 to 11, 4 to 5 (this pattern is different from the previous ones), 2 to 5, or 1 to 5. If the number of days is negative i should ignore. My question is even after sorting will the calculations be possible for all the status changes using stage variables. Please let me know.
Thank you!!
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
So... not strictly chronological as I assumed where all you need is the number of days between any two status changes. Rather, for each Id you need a series of day counts between specific status changes? And when you said:
"I need to calculate days for status change from 5 to 28, 5 to 27, 5 to 26, 5 to 25, 10 to 11, 4 to 5 (this pattern is different from the previous ones), 2 to 5, or 1 to 5."
Did you mean "and 1 to 5", meaning as many of these as exist for a given Id? If so, with a finite number of possible status codes you care about per id (10 it seems) to store, I'd use a hashed file keyed by Id with a data field for each status holding a date. Then Read/Write to that same hashed file for each record, inserting if it doesn't exist yet for that Id and updating each time it does. In essence, a pivot.
When you are done, source from the hashed file as it will have all possible dates per Id at that point. Then calculate the date ranges for all valid status changes with dates.
"I need to calculate days for status change from 5 to 28, 5 to 27, 5 to 26, 5 to 25, 10 to 11, 4 to 5 (this pattern is different from the previous ones), 2 to 5, or 1 to 5."
Did you mean "and 1 to 5", meaning as many of these as exist for a given Id? If so, with a finite number of possible status codes you care about per id (10 it seems) to store, I'd use a hashed file keyed by Id with a data field for each status holding a date. Then Read/Write to that same hashed file for each record, inserting if it doesn't exist yet for that Id and updating each time it does. In essence, a pivot.
When you are done, source from the hashed file as it will have all possible dates per Id at that point. Then calculate the date ranges for all valid status changes with dates.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Premium Member
- Posts: 49
- Joined: Wed Dec 13, 2006 6:31 pm
-
- Premium Member
- Posts: 49
- Joined: Wed Dec 13, 2006 6:31 pm
Chulett, sorry about that. It is "and 1 to 5". In fact i have to calculate days for 20 status changes and i have mentioned 10 only. Excuse me here, but could you please explain this "Then Read/Write to that same hashed file for each record, inserting if it doesn't exist yet for that Id and updating each time it does. In essence, a pivot"chulett wrote: So... not strictly chronological as I assumed where all you need is the number of days between any two status changes. Rather, for each Id you need a series of day counts between specific status changes? And when you said:
"I need to calculate days for status change from 5 to 28, 5 to 27, 5 to 26, 5 to 25, 10 to 11, 4 to 5 (this pattern is different from the previous ones), 2 to 5, or 1 to 5."
Did you mean "and 1 to 5", meaning as many of these as exist for a given Id? If so, with a finite number of possible status codes you care about per id (10 it seems) to store, I'd use a hashed file keyed by Id with a data field for each status holding a date. Then Read/Write to that same hashed file for each record, inserting if it doesn't exist yet for that Id and updating each time it does. In essence, a pivot.
When you are done, source from the hashed file as it will have all possible dates per Id at that point. Then calculate the date ranges for all valid status changes with dates.
Thanks a lot.
Meaning, the same hashed file as both a lookup and your target.
Don't cache the lookup so the writes are immediately available. Every input record gets written to the hashed file. On a 'miss' you assemble a new record with all except one data field empty, on a 'hit' you write back all of the current values plus the new one in the proper slot.
Code: Select all
Hashed
|
Source -> Trn -> Hashed -> [calc days]
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers