All,
Need one help on rows to column spilling in datastage. It is a parallel environment where source is db2 database. Basically we have four columns,
userid,begiinigdatettime,status.endingdatetime,duration.
Sam,2014-04-12 7.00am, on call,2014-04-12,7.30am, 1800
Sam,2014-04-12 7.30 am,busy,2014-04-12,8.00am,1800
Sam,2014-04-12 8.00 am, free to talk,2014-04-12,8.15am,900
- it will be going on until the user logged out.
We basically want the data looks like,
userid,beginningdatetime,endingdatettime,on call,busy,free to talk.
Sam, 2014-04-12 7.00am,2014-04-12,8.15am,00.30.00,00.30.00,00.15.00
Any help what kind of stage can we use to achieve it. I used pivot enterprise, but its getting more critical on it.
Thanks.
Rows to column split in Datastage.
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Pivot Enterprise would seem to me to be the correct stage to use. What's your actual problem?
Without creating a custom (Build) stage, you might be able to leverage the looping capability in the Transformer stage.
In all of the above, I assume you are processing after the event. What you seem to be trying to do would be very difficult to accomplish in real time.
Without creating a custom (Build) stage, you might be able to leverage the looping capability in the Transformer stage.
In all of the above, I assume you are processing after the event. What you seem to be trying to do would be very difficult to accomplish in real time.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks Ray. It is quite tough for sure.
In Pivot stage, if use array of 3 for on call,busy,free to talk. its repeating the same for next other rows which does not make sense. )pivoting on status.
Its coming like,
userid,beginningdatetime,endingdatettime,status1,status2,status3
sam,2014-04-12 7.00am,2014-04-12,7.30am,on call,busy,free to talk
sam,2014-04-12 8.30 am,2014-04-9.00,on break,available,on call
sam,2014-04-12 9.00am, 2014-04-12 10.00am, busy,on call, free to talk
if I pivot on beginningdatettime and enddatetime as well, then will get one extra column like,
userid,beginningdatetime1,status1,enddatetime1,beginningdatetime2,status2,enddatetime2,beginningdatetime3,status3,enddatetime3
Probably we need to group by the status which then calculating time would be easier.
In Pivot stage, if use array of 3 for on call,busy,free to talk. its repeating the same for next other rows which does not make sense. )pivoting on status.
Its coming like,
userid,beginningdatetime,endingdatettime,status1,status2,status3
sam,2014-04-12 7.00am,2014-04-12,7.30am,on call,busy,free to talk
sam,2014-04-12 8.30 am,2014-04-9.00,on break,available,on call
sam,2014-04-12 9.00am, 2014-04-12 10.00am, busy,on call, free to talk
if I pivot on beginningdatettime and enddatetime as well, then will get one extra column like,
userid,beginningdatetime1,status1,enddatetime1,beginningdatetime2,status2,enddatetime2,beginningdatetime3,status3,enddatetime3
Probably we need to group by the status which then calculating time would be easier.
In filter stage split in status. So on call, busy, free to talk will go on three links.
In another filter link use condition (on call or busy or free to talk) and send userid down that link. Three rows will go down this link. Then in remove duplicate stage dedup it to only one row Sam.
Then join this link with the earlier links for on call, busy and free to talk base on userid. Then the data would be in one row.
You can then do the calculation for endingdatetime_busy - endingdatetime_oncall, endingdatetime_freetotalk - endingdatetime_busy.
In another filter link use condition (on call or busy or free to talk) and send userid down that link. Three rows will go down this link. Then in remove duplicate stage dedup it to only one row Sam.
Then join this link with the earlier links for on call, busy and free to talk base on userid. Then the data would be in one row.
You can then do the calculation for endingdatetime_busy - endingdatetime_oncall, endingdatetime_freetotalk - endingdatetime_busy.
Okay,I tried this way. So, I changed the sql query and it is pulling data like,
userid,time_spent,status,date
sam,0:15:12,Free to talk,2013-04-20
sam,1:15:22,Busy,2013-04-20
sam,00:47:15,available,2013-04-20
sam,04:15:11,out of desk,2013-04-20
jim,02:10:15,busy.2013-04-20
jim,00:15:10,free to talk,2013-04-20
Now, in final out it should look like,
userid,free to talk,busy,available,outofdesk
sam,00:15:12,01:15:22,,00:47:15,00:00:00
jim,00:15:10,02:10:15,00:00:00,00:00:00
My only doubt is , can we achieve this with PIVOT enterprise. Kind of confused....
Thanks again.
userid,time_spent,status,date
sam,0:15:12,Free to talk,2013-04-20
sam,1:15:22,Busy,2013-04-20
sam,00:47:15,available,2013-04-20
sam,04:15:11,out of desk,2013-04-20
jim,02:10:15,busy.2013-04-20
jim,00:15:10,free to talk,2013-04-20
Now, in final out it should look like,
userid,free to talk,busy,available,outofdesk
sam,00:15:12,01:15:22,,00:47:15,00:00:00
jim,00:15:10,02:10:15,00:00:00,00:00:00
My only doubt is , can we achieve this with PIVOT enterprise. Kind of confused....
Thanks again.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: