Page 1 of 2

Need to convert columns to rows

Posted: Fri Feb 17, 2012 1:35 am
by deepa.y
Hi,
I have the data under 2 columns
col1 col2
a 1
b xy
c 3

And the number of records under two columns differ from file to file.
I need to convert column data to records
i.e the output should be
a,b,c
1,xy,3 .

I tried this using vertical pivot but it is working only if there are fixed number if records.

Posted: Fri Feb 17, 2012 1:42 am
by kandyshandy
search this forum and you will see n number of posts.

Posted: Fri Feb 17, 2012 2:32 am
by deepa.y
Hi Kandy,
I searched here and i found that it can be done by vertical pivot.So,i implemented the same thing using Vertical pivot.But it works only if the number of records are known.
Please suggest me how to implement this in transformer stage by taking count of records.

Posted: Fri Feb 17, 2012 3:00 am
by kandyshandy
Deepa, Your sample (input/output) data does not look like vertical pivot scenario. Do you have a group by column or not?

You can achieve vertical pivot in transformer (using stage variables to determine group change & concatenate columns).

Read IBM doc for more clarity.

Posted: Fri Feb 17, 2012 3:11 am
by deepa.y
To use vertical pivot,before vertical pivot stage,in transformer i took a column called pivot assigned it to 1 and then used two pivot stages to pivot col1 seperately taking group by column as pivot and col2 seperately .Finally,using funnel stage, merged them to single file.

Posted: Fri Feb 17, 2012 3:17 am
by kandyshandy
That makes sense. Now, try using stage variables to achieve the same. In transformer method, you don't need count of records. All you need is to find when a group changes. Your case is pretty simple as the group does not change at all. i.e. always "1"

Posted: Fri Feb 17, 2012 3:31 am
by deepa.y
Hi Kandy,
Could you please explain me in detail?

Posted: Fri Feb 17, 2012 3:59 am
by deepa.y
Hi,
I implemented the following logic
I took a column pivot and assigned value 1.
Then in stage variables,i gave
current key=pivot
nextkey=currentkey
concat=if currentkey=nextkey then concat:lnk_input.col1 else ''
and it is giving me the following output.

,a
,a,b
,a,b,c--->need to read this row and remove comma infront.

in the constraint i cannot use pivot<>1 as pivot is always 1 in my case.

Posted: Fri Feb 17, 2012 4:02 am
by deepa.y
deepa.y wrote:concat=if currentkey=nextkey then concat:lnk_input.col1 else ''
Sorry there is mistake in my previous reply.
concat=if currentkey=nextkey then concat:","
:lnk_input.col1 else ''

Posted: Sun Feb 19, 2012 8:18 pm
by kandyshandy
Remove unwanted rows by adding another stage ;)

Posted: Sun Feb 19, 2012 10:43 pm
by mobashshar
Good. But how will you control the data partitioning? The data will go on various nodes??

Posted: Mon Feb 20, 2012 12:31 am
by kandyshandy
yeah.. OP will find that "another" stage to use. She will refer the IBM doc for that stage & that's when she will ensure data is partitioned ;) Her case is a simple one!!

Posted: Mon Feb 20, 2012 1:19 am
by deepa.y
kandyshandy wrote:Remove unwanted rows by adding another stage ;)
Hi Kandy,
Instead of dummy column pivot,i took a column called row_num
and assigned it to @inrownum system variable.
so that data from each file can have a unique key
i.e file1: ,a 1
,a,b 1
,a,b,c 1

file2: ,x 2
,x,y 2
,x,y,z 2

so that i can arrange them in descinding order and pick the first row for each key.
But when i read four files at a time,the @inrownum is not getting incremented.
the file1 is getting row_num 1
file2 2
file3 1
file4 2
that is row num is not incremented as 1,2,3,4 for my case to work.Please
suggest me on this.

Posted: Mon Feb 20, 2012 1:54 am
by kandyshandy
Deepa, Your took a complicated route now.. @INROWNUM is the incoming ROW number from each parition. That's tricky.

Your earlier approach was simple for your requirements. Just assign 1 to pivot column and process in transformer stage. When done, sort by your new (concatenated) column in desc order and take the first record by using the appropriate stage ;)

Posted: Mon Feb 20, 2012 2:50 am
by deepa.y
kandyshandy wrote:Just assign 1 to pivot column and process in transformer stage. When done, sort by your new (concatenated) column in desc order and take the first record by using the appropriate stage ;)
Hi kandy,
But the earlier approach is giving me the problem when readng more than 2 files.
During concatination,it is concatenating the data coming in the same partition.
for example,
file1: col1 col2 file2:col1 col2 file3: col1 col2
a 1 x 1 p 1
b 2 y 2 q 2
c 3 z 3 r 3

result file:
a
a,b
a,b,c
a,b,c,p
a,b,c,p,q
a,b,c,p,q,r

x
x,y
x,y,z

so,if i sort in descinding order,i will get only a,b,c,p,q,r
but actually what i need is
a,b,c
p,q,r