How to convert columns into Rows with metadata?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
pkll
Participant
Posts: 73
Joined: Thu Oct 25, 2012 9:45 pm

How to convert columns into Rows with metadata?

Post by pkll »

Hi,

I have One Scenario.

EMPNO|ENAME|SALARY
101|AAA|10000
102|BBB|20000
103|CCC|30000


I want Output

EMPNO|101|102|103
EMPNAME|AAA|BBB|CCC
SALARY|10000|20000|30000


I tried but I didn't get desired output. I create one dummy key column in transformer datastage and hardcoded one value. In sort stage based on dummy key column I selected create keychange column = TRUE.
Next by using the transformer stage I checked duplicated value or not if duplicated value I concatenate the value and remove duplicate stage I had taken duplicate retain = Last.

Finally I got output is like

EMPNO ENAME SAL
-------------- ------------------ --------------------------
101|102|103 AAA|BBB|CCC 10000|20000|30000

But I want Output

EMPNO|101|102|103
EMPNAME|AAA|BBB|CCC
SALARY|10000|20000|30000

Please help me ?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

So, which part isn't working - the sort, the transformer or the remove duplicates? Seems to me you'll need to hardcode all three values, pivot then the concatenation and finally remove duplicates.
-craig

"You can never have too many knives" -- Logan Nine Fingers
pkll
Participant
Posts: 73
Joined: Thu Oct 25, 2012 9:45 pm

Post by pkll »

Hi Chullet,

in the above example I don't have any key column so I had created one dummy column and assign value '1'. I had taken that value as key column.
Remaining all I applied logic.

but I am getting output as


EMPNO ENAME SAL
-------------- ------------------ --------------------------
101|102|103 AAA|BBB|CCC 10000|20000|30000


But I Want output

EMPNO|101|102|103
EMPNAME|AAA|BBB|CCC
SALARY|10000|20000|30000
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

That dummy column buys you nothing. Instead, hardcode the three column names and then pivot. Meaning, turn this input:

EMPNO|ENAME|SALARY
101|AAA|10000
102|BBB|20000
103|CCC|30000

Into this:
EMPNO|ENAME|SALARY|EMPNO_NM|ENAME_NM|SALARY_NM
101|AAA|10000|EMPNO|ENAME|SALARY
102|BBB|20000|EMPNO|ENAME|SALARY
103|CCC|30000|EMPNO|ENAME|SALARY

Then use the last three columns as the Pivot key, matched up with their corresponding values:

Key: EMPNO_NM|ENAME_NM|SALARY_NM
Value: EMPNO|ENAME|SALARY

What comes out of the pivot looks like:
EMPNO|101
EMPNO|102
EMPNO|103
ENAME|AAA
ENAME|BBB
ENAME|CCC
SALARY|10000
SALARY|20000
SALARY|30000

Now you've got what you need for the concatenation to work.

ps. This should be pretty darn close but is all off the top of my head with no testing of any kind.
-craig

"You can never have too many knives" -- Logan Nine Fingers
pkll
Participant
Posts: 73
Joined: Thu Oct 25, 2012 9:45 pm

Post by pkll »

Hi Chulett,

I have not three columns. I have around 50 columns .I can't create 50 dummy columns.

EMPNO
-------
101
102
103

Required output is

EMPNO|101|102|103

but I am getting output as
EMPNO
101|102|103

So how can I get required output means I need to concatenate with metadata also.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Awesome. Another "example" with made up data that doesn't match what really needs to be done. Would it have killed you to have mentioned that in the original post? And repeating everything with just one column as the example doesn't help nor change anything, my answer is still my answer. :?

First explain to me why you "can't create 50 dummy columns". Can't or just don't really want to?
-craig

"You can never have too many knives" -- Logan Nine Fingers
pkll
Participant
Posts: 73
Joined: Thu Oct 25, 2012 9:45 pm

Post by pkll »

If I will create 50 Dummy columns performance will decrease more over I have more than 2 million records is there. Above I had taken one column instead of three columns .if 1 column output will come means I can apply same logic remaining columns.
any how
based on my output how can I get required output

my output is
EMPNO
----------
101|102|103

Required output is

EMPNO|101|102|103
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Performance will decrease? So? Isn't the goal to get a working solution? I've already posted how you can get your desired output, 1 column or 50. And I really don't see the addition of a number of columns as any kind of performance killer. And please stop repeating your example. Repetition != helping.

Look - there's no automagic way to get the metadata in the job as data unless you provide it. I don't really see any way around the need for additional columns and I've given you the traditional answer where they are hardcoded into those additional columns. Another option you could look into to provide that would be to put the column names as data in another source - flat file, database table - and then merge the two inputs. That could be a cross join or it could be a merge depending on how you setup the column name values. I'll leave you to work out the gory details.

And by that I mean chose an approach, give it a shot and see what works and what doesn't and then come back with those issues.
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:?: Question for the masses - will the Pivot stage support listing ~100 column names in the fields in question? Or does it not really care how many columns are listed as long as they all fit within that property / string?
Last edited by chulett on Wed Jun 11, 2014 9:09 pm, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

pkll wrote:If I will create 50 Dummy columns performance will decrease
Prove it.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pkll
Participant
Posts: 73
Joined: Thu Oct 25, 2012 9:45 pm

Post by pkll »

Hi Chulett,

As per your suggestion I got the required output. If I tortured you extremely sorry. As per our conversation I am learning something.
Thanks for it.

I have one more doubt, Now I am thinking as reverse order

suppose if my input is

EMPNO|101|102|103
EMPNAME|AAA|BBB|CCC
SALARY|10000|20000|30000

shell I get output as

EMPNO|ENAME|SALARY
101|AAA|10000
102|BBB|20000
103|CCC|30000

is it possible? if possible please suggest me?

Thanks for Advance!!!
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

pkll wrote:If I tortured you extremely sorry. As per our conversation I am learning something. Thanks for it.
No worries... and good. :wink:

I'll have to ponder the reverse though it is 'just another pivot'... and there's no reason for me to go this alone, other suggestions welcome!
-craig

"You can never have too many knives" -- Logan Nine Fingers
pkll
Participant
Posts: 73
Joined: Thu Oct 25, 2012 9:45 pm

Post by pkll »

Hi Chulett,

Can you please explain detail level how can I do by using pivot stage?

Thanks for Advance!!
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I don't have much if any spare time at the moment. Hopefully you've been working on this, pondering / testing / something and not simply waiting for me... right? :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply