Page 1 of 1

Pivoting variable number of colums to rows

Posted: Fri Sep 21, 2007 5:32 am
by manish1005
Hi,

I have to split some colums of input data into multiple rows . I searched for this in this forum but pivot is mentioned in all posts which I suppose do not work for variable number columns without any pre-defined upper limit. In other posts its recommended to write a custom routine for this task.
I just wanted to know if there is some other alternative to do it without writing customized routines.

Input:
2007-09-21,4162,2,3080 2980 2040 1920 1927,218.4 20.4 117.6 50.4 579.6,0 0 0 0 0
2007-09-21,586,1,4263 4270 171 141 4023 3778 3886 2973 2905 2886,132.0 214.8 288.0 298.8 114.0 50.4 248.4 85.2 19.2 369.6,0 0 0 0 0 0 0 0 0 0
So here column delimiter is ',' and delimiter for values within colums 4,5&6 is single space(' ').
output should be:
2007-09-21,4162,2,3080 ,218.4 ,0
2007-09-21,4162,2,2980 ,20.4 ,0
2007-09-21,4162,2,2040 ,117.6,0
2007-09-21,4162,2,1920 ,50.4,0
2007-09-21,4162,2,1927 ,579.6,0
2007-09-21,586,1,4263 ,132.0 ,0
2007-09-21,586,1,4270 ,214.8 ,0
2007-09-21,586,1,171 141 , 288.0 ,0
.......
So I am choosing one value from col4 and corresponding values from col5 & col6 also. Columns 4, 5 & 6 will always have equal number of sub-columns but number of these sub-columns is variable without any pre-defined upper limit.

Please suggest.

Posted: Fri Sep 21, 2007 7:03 am
by srinagesh
Hi,

I am unable to give you a datastage solution diagram for this problem, due to lack of time, but here is the logic

Code: Select all

Input:   a, b, c, 4, 5, 6
            p, q, r, 4, 5, 6
Assume that a,b,c are your primary keys and 4, 5, 6 are 4th,5th & 6th columns (structure of these columns will be 4.1 4.2 4.3.. 4.n for 4th column , 5.1..5.n for 5th column , 6.1.. 6.n for 6th column)

Use a row splitter to split the row as

Code: Select all

a,b,c,4
a,b,c,5
a,b,c,6
p,q,r,4
p,q,r,5
p,q,r,6
use pivot to get

Code: Select all

a,b,c,4.1,1
a,b,c,4.2,2
a,b,c,5.1,1
a,b,c,5.2,2
a,b,c,6.1,1
a,b,c,6.2,2
p,q,r,4.1,1
p,q,r,4.2,2
p,q,r,4.3,3
p,q,r,5.1,1
p,q,r,5.2,2
p,q,r,5.3,3
p,q,r,6.1,1
p,q,r,6.2,2
p,q,r,6.3,3
where the last column (with values 1, 2, 3...) is the number of the value in that column. (ie., if 4th column is 10,20,30... the output will be 10,1 20,2 30,3 )

Now everything is in flat structure

Use the 1st, 2nd, 3rd and 5th columns as key and merge the data

You should get

Code: Select all

a,b,c,4.1,5.1,6.1
a,b,c,4.2,5.2,6.2
p,q,r,4.1,5.1,6.1
p,q,r,4.2,5.2,6.2
p,q,r,4.3,5.3,6.3
I will try to put the datastage diagram sometime soon, but till then, hope this helps

-Nagesh