Pivoting variable number of colums to rows

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
manish1005
Participant
Posts: 39
Joined: Thu Nov 23, 2006 11:23 pm

Pivoting variable number of colums to rows

Post by manish1005 »

Hi,

I have to split some colums of input data into multiple rows . I searched for this in this forum but pivot is mentioned in all posts which I suppose do not work for variable number columns without any pre-defined upper limit. In other posts its recommended to write a custom routine for this task.
I just wanted to know if there is some other alternative to do it without writing customized routines.

Input:
2007-09-21,4162,2,3080 2980 2040 1920 1927,218.4 20.4 117.6 50.4 579.6,0 0 0 0 0
2007-09-21,586,1,4263 4270 171 141 4023 3778 3886 2973 2905 2886,132.0 214.8 288.0 298.8 114.0 50.4 248.4 85.2 19.2 369.6,0 0 0 0 0 0 0 0 0 0
So here column delimiter is ',' and delimiter for values within colums 4,5&6 is single space(' ').
output should be:
2007-09-21,4162,2,3080 ,218.4 ,0
2007-09-21,4162,2,2980 ,20.4 ,0
2007-09-21,4162,2,2040 ,117.6,0
2007-09-21,4162,2,1920 ,50.4,0
2007-09-21,4162,2,1927 ,579.6,0
2007-09-21,586,1,4263 ,132.0 ,0
2007-09-21,586,1,4270 ,214.8 ,0
2007-09-21,586,1,171 141 , 288.0 ,0
.......
So I am choosing one value from col4 and corresponding values from col5 & col6 also. Columns 4, 5 & 6 will always have equal number of sub-columns but number of these sub-columns is variable without any pre-defined upper limit.

Please suggest.
Last edited by manish1005 on Fri Sep 21, 2007 10:23 am, edited 1 time in total.
srinagesh
Participant
Posts: 125
Joined: Mon Jul 25, 2005 7:03 am

Post by srinagesh »

Hi,

I am unable to give you a datastage solution diagram for this problem, due to lack of time, but here is the logic

Code: Select all

Input:   a, b, c, 4, 5, 6
            p, q, r, 4, 5, 6
Assume that a,b,c are your primary keys and 4, 5, 6 are 4th,5th & 6th columns (structure of these columns will be 4.1 4.2 4.3.. 4.n for 4th column , 5.1..5.n for 5th column , 6.1.. 6.n for 6th column)

Use a row splitter to split the row as

Code: Select all

a,b,c,4
a,b,c,5
a,b,c,6
p,q,r,4
p,q,r,5
p,q,r,6
use pivot to get

Code: Select all

a,b,c,4.1,1
a,b,c,4.2,2
a,b,c,5.1,1
a,b,c,5.2,2
a,b,c,6.1,1
a,b,c,6.2,2
p,q,r,4.1,1
p,q,r,4.2,2
p,q,r,4.3,3
p,q,r,5.1,1
p,q,r,5.2,2
p,q,r,5.3,3
p,q,r,6.1,1
p,q,r,6.2,2
p,q,r,6.3,3
where the last column (with values 1, 2, 3...) is the number of the value in that column. (ie., if 4th column is 10,20,30... the output will be 10,1 20,2 30,3 )

Now everything is in flat structure

Use the 1st, 2nd, 3rd and 5th columns as key and merge the data

You should get

Code: Select all

a,b,c,4.1,5.1,6.1
a,b,c,4.2,5.2,6.2
p,q,r,4.1,5.1,6.1
p,q,r,4.2,5.2,6.2
p,q,r,4.3,5.3,6.3
I will try to put the datastage diagram sometime soon, but till then, hope this helps

-Nagesh
Post Reply