Converting multiple columns to single row

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
anallamothu
Participant
Posts: 5
Joined: Wed Nov 30, 2011 3:37 pm

Converting multiple columns to single row

Post by anallamothu »

Hello All,

I am developing a parallel job, in which I need to transform multiple columns to a single row.

My source file looks like this

Col1 Col2 Col3 Col4
1 A NULL NULL
2 NULL B NULL
3 NULL NULL C
4 NULL D NULL
5 E NULL NULL
6 NULL NULL F


I should transform that source file to the following Traget file type

Col1 Col2 Col3 Col4
1 A B C
2 E D F


I tried using vertical pivot stage and also the loopvariable in transformer. But it doesn't succeded. :oops:

I hope some of you have idea about this one :)

--
Thanks,
Andy
Kryt0n
Participant
Posts: 584
Joined: Wed Jun 22, 2005 7:28 pm

Post by Kryt0n »

Effectively you are just aggregating the data but what decides which rows merge in to one?

If purely based on order, make sure they are in the right order... then have a generated column that is common for every 3 rows, then aggregate (e.g. putting nulls first and keeping the max for each column kind of thing)

To try make it clearer
Col1 GenCol Col2 Col3 Col4
1 1 A NULL NULL
2 1 NULL B NULL
3 1 NULL NULL C
4 2 NULL D NULL
5 2 E NULL NULL
6 2 NULL NULL F
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

I agree it's not clear what the rules are. If we assume groups of 3 records at a time, it can probably be done using stage variables alone.
Choose a job you love, and you will never have to work a day in your life. - Confucius
kandyshandy
Participant
Posts: 597
Joined: Fri Apr 29, 2005 6:19 am
Location: Singapore

Post by kandyshandy »

such a non-realistic requirement?
Kandy
_________________
Try and Try again…You will succeed atlast!!
anallamothu
Participant
Posts: 5
Joined: Wed Nov 30, 2011 3:37 pm

Post by anallamothu »

sorry for not making it clear in my previous post.I made some modifications to the soruce data in order to make it clear

here the col1 data (i.e. 1,2,3..6) is nothing but an incremental sequence and the col2,col3,col4 data are the one's which I need to combine in a single record.

As qt_ky mentioned yes I need to process in groups of 3 records at a time.

And "NULL" describes there isn't any value present in those corresponding columns.


Col1 Col2 Col3 Col4
1 A NULL NULL
2 NULL B NULL
3 NULL NULL C
4 NULL NULL D
5 NULL NULL E
6 F NULL NULL

And the Tagert file is given below

Col1 Col2 Col3 Col4
1 A B C
2 D E F

here is the tricky part, sometimes the source data can be in this format also

Col1 Col2 Col3 Col4
1 NULL NULL A
2 NULL NULL B
3 NULL NULL C
4 NULL NULL D
5 NULL NULL E
6 NULL NULL F

still I need the output to be in the following format only

Col1 Col2 Col3 Col4
1 A B C
2 D E F



Please disregard my previous post.

--
Thanks,
Andy
anallamothu
Participant
Posts: 5
Joined: Wed Nov 30, 2011 3:37 pm

Post by anallamothu »

qt_ky wrote:I agree it's not clear what the rules are. If we assume groups of 3 records at a time, it can probably be done using stage variables alone.
Hi ,

Can you give me a little knowledge of how it can be done using stage variables?

--
Thanks,
Andy
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

The clarification helps. A stage variable-only solution is possible here, but it may be considered as "quick and dirty." Something more flexible may involve doing a pivot as a second step. If you get interested in that, search this forum on 'pivot.' We can discuss stage vars for purpose of learning.

Your Col1 of a sequential row ID is easily generated in a Transformer stage using system variable @INROWNUM. To start out, run your stage sequentially or run the parallel job on a single node config file.

Insert a new stage variable (suppose you name it svRow123) with derivation:

Code: Select all

Mod(@INROWNUM - 1, 3) + 1
Put that stage var into a new output column and view the result. Without any constraint, you should get a repeating pattern to work with in your output rows of 1,2,3,1,2,3,... Run it at this point and check.

Append another stage variable with derivation:

Code: Select all

NullToEmpty(lnk.Col2) : NullToEmpty(lnk.Col3) : NullToEmpty(lnk.Col4)
Drag it into a new output column, run it, and view the result. It should give rows like A,B,C,D,E,F,...

Add an output link constraint to only output every 3rd row:

Code: Select all

svRow123 = 3
That is only a start. Take it one step at a time... To complete it in this way will require more stage variables and more understanding of them. I left out that gap for now. Try it and see what ideas and questions you may get.
Choose a job you love, and you will never have to work a day in your life. - Confucius
anallamothu
Participant
Posts: 5
Joined: Wed Nov 30, 2011 3:37 pm

Post by anallamothu »

Hi qt_ky,

The issue is resolved :)

I did followed u r approach by concatinating those 3 columns and later I used the pivot stage(vertical pivot).



--
Thanks,
Andy
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The second person possessive pronoun in English is spelled "your". It is not "u r".

DSXchange is not a mobile phone; there is no requirement to use SMS-style abbreviation. Strive, instead, for a professional standard of written English. Among other things, this makes things easier for those whose first language is not English.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
anallamothu
Participant
Posts: 5
Joined: Wed Nov 30, 2011 3:37 pm

Post by anallamothu »

ray.wurlod wrote:The second person possessive pronoun in English is spelled "your". It is not "u r".

DSXchange is not a mobile phone; there is no requirement to use SMS-style abbreviation. Strive, instead, for a professional standard of written English. Among other things, this makes things easier for those whose first language is not English.
Thanks for pointing it out.

--
Andy
Post Reply