Converting multiple columns to single row
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 5
- Joined: Wed Nov 30, 2011 3:37 pm
Converting multiple columns to single row
Hello All,
I am developing a parallel job, in which I need to transform multiple columns to a single row.
My source file looks like this
Col1 Col2 Col3 Col4
1 A NULL NULL
2 NULL B NULL
3 NULL NULL C
4 NULL D NULL
5 E NULL NULL
6 NULL NULL F
I should transform that source file to the following Traget file type
Col1 Col2 Col3 Col4
1 A B C
2 E D F
I tried using vertical pivot stage and also the loopvariable in transformer. But it doesn't succeded.
I hope some of you have idea about this one
--
Thanks,
Andy
I am developing a parallel job, in which I need to transform multiple columns to a single row.
My source file looks like this
Col1 Col2 Col3 Col4
1 A NULL NULL
2 NULL B NULL
3 NULL NULL C
4 NULL D NULL
5 E NULL NULL
6 NULL NULL F
I should transform that source file to the following Traget file type
Col1 Col2 Col3 Col4
1 A B C
2 E D F
I tried using vertical pivot stage and also the loopvariable in transformer. But it doesn't succeded.
I hope some of you have idea about this one
--
Thanks,
Andy
Effectively you are just aggregating the data but what decides which rows merge in to one?
If purely based on order, make sure they are in the right order... then have a generated column that is common for every 3 rows, then aggregate (e.g. putting nulls first and keeping the max for each column kind of thing)
To try make it clearer
Col1 GenCol Col2 Col3 Col4
1 1 A NULL NULL
2 1 NULL B NULL
3 1 NULL NULL C
4 2 NULL D NULL
5 2 E NULL NULL
6 2 NULL NULL F
If purely based on order, make sure they are in the right order... then have a generated column that is common for every 3 rows, then aggregate (e.g. putting nulls first and keeping the max for each column kind of thing)
To try make it clearer
Col1 GenCol Col2 Col3 Col4
1 1 A NULL NULL
2 1 NULL B NULL
3 1 NULL NULL C
4 2 NULL D NULL
5 2 E NULL NULL
6 2 NULL NULL F
-
- Participant
- Posts: 597
- Joined: Fri Apr 29, 2005 6:19 am
- Location: Singapore
-
- Participant
- Posts: 5
- Joined: Wed Nov 30, 2011 3:37 pm
sorry for not making it clear in my previous post.I made some modifications to the soruce data in order to make it clear
here the col1 data (i.e. 1,2,3..6) is nothing but an incremental sequence and the col2,col3,col4 data are the one's which I need to combine in a single record.
As qt_ky mentioned yes I need to process in groups of 3 records at a time.
And "NULL" describes there isn't any value present in those corresponding columns.
Col1 Col2 Col3 Col4
1 A NULL NULL
2 NULL B NULL
3 NULL NULL C
4 NULL NULL D
5 NULL NULL E
6 F NULL NULL
And the Tagert file is given below
Col1 Col2 Col3 Col4
1 A B C
2 D E F
here is the tricky part, sometimes the source data can be in this format also
Col1 Col2 Col3 Col4
1 NULL NULL A
2 NULL NULL B
3 NULL NULL C
4 NULL NULL D
5 NULL NULL E
6 NULL NULL F
still I need the output to be in the following format only
Col1 Col2 Col3 Col4
1 A B C
2 D E F
Please disregard my previous post.
--
Thanks,
Andy
here the col1 data (i.e. 1,2,3..6) is nothing but an incremental sequence and the col2,col3,col4 data are the one's which I need to combine in a single record.
As qt_ky mentioned yes I need to process in groups of 3 records at a time.
And "NULL" describes there isn't any value present in those corresponding columns.
Col1 Col2 Col3 Col4
1 A NULL NULL
2 NULL B NULL
3 NULL NULL C
4 NULL NULL D
5 NULL NULL E
6 F NULL NULL
And the Tagert file is given below
Col1 Col2 Col3 Col4
1 A B C
2 D E F
here is the tricky part, sometimes the source data can be in this format also
Col1 Col2 Col3 Col4
1 NULL NULL A
2 NULL NULL B
3 NULL NULL C
4 NULL NULL D
5 NULL NULL E
6 NULL NULL F
still I need the output to be in the following format only
Col1 Col2 Col3 Col4
1 A B C
2 D E F
Please disregard my previous post.
--
Thanks,
Andy
-
- Participant
- Posts: 5
- Joined: Wed Nov 30, 2011 3:37 pm
The clarification helps. A stage variable-only solution is possible here, but it may be considered as "quick and dirty." Something more flexible may involve doing a pivot as a second step. If you get interested in that, search this forum on 'pivot.' We can discuss stage vars for purpose of learning.
Your Col1 of a sequential row ID is easily generated in a Transformer stage using system variable @INROWNUM. To start out, run your stage sequentially or run the parallel job on a single node config file.
Insert a new stage variable (suppose you name it svRow123) with derivation:
Put that stage var into a new output column and view the result. Without any constraint, you should get a repeating pattern to work with in your output rows of 1,2,3,1,2,3,... Run it at this point and check.
Append another stage variable with derivation:
Drag it into a new output column, run it, and view the result. It should give rows like A,B,C,D,E,F,...
Add an output link constraint to only output every 3rd row:
That is only a start. Take it one step at a time... To complete it in this way will require more stage variables and more understanding of them. I left out that gap for now. Try it and see what ideas and questions you may get.
Your Col1 of a sequential row ID is easily generated in a Transformer stage using system variable @INROWNUM. To start out, run your stage sequentially or run the parallel job on a single node config file.
Insert a new stage variable (suppose you name it svRow123) with derivation:
Code: Select all
Mod(@INROWNUM - 1, 3) + 1
Append another stage variable with derivation:
Code: Select all
NullToEmpty(lnk.Col2) : NullToEmpty(lnk.Col3) : NullToEmpty(lnk.Col4)
Add an output link constraint to only output every 3rd row:
Code: Select all
svRow123 = 3
Choose a job you love, and you will never have to work a day in your life. - Confucius
-
- Participant
- Posts: 5
- Joined: Wed Nov 30, 2011 3:37 pm
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
The second person possessive pronoun in English is spelled "your". It is not "u r".
DSXchange is not a mobile phone; there is no requirement to use SMS-style abbreviation. Strive, instead, for a professional standard of written English. Among other things, this makes things easier for those whose first language is not English.
DSXchange is not a mobile phone; there is no requirement to use SMS-style abbreviation. Strive, instead, for a professional standard of written English. Among other things, this makes things easier for those whose first language is not English.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 5
- Joined: Wed Nov 30, 2011 3:37 pm
Thanks for pointing it out.ray.wurlod wrote:The second person possessive pronoun in English is spelled "your". It is not "u r".
DSXchange is not a mobile phone; there is no requirement to use SMS-style abbreviation. Strive, instead, for a professional standard of written English. Among other things, this makes things easier for those whose first language is not English.
--
Andy