Transpose Columns into rows

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Transpose Columns into rows

Post by karthi_gana »

All,

I have the file like below

Code: Select all

ven_symbol 3        5       6     8      9     10    12     13
ABC            20.5    19.5  4.3   3.2   2.9  1.8   12.5  7.5
DEF             40.5    12.5  4.3   5.2   6.9  7.8   72.5  8.5
I would like to transpose the file as below

Code: Select all

ven_symbol   code   value
ABC              3         20.5
ABC              5         19.5
ABC              6           4.3
ABC              8           3.2
ABC              9           2.9
ABC              10         1.8
ABC              12        12.5
ABC              13         7.5
DEF               3         40.5
DEF               5         12.5
DEF               6           4.3
DEF               8           5.2
DEF               9           6.9
DEF               10         7.8
DEF               12        72.5
DEF               13          8.5
Karthik
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Re: Transpose Columns into rows

Post by karthi_gana »

I did some search here and found that i have to use 'PIVOT' stage to accomplish this requirement.

But i splitted the output like below

ven_symbol value
ABC 20.5
ABC 19.5
..
..
..


But how should i include the column name like below?

ven_symbol code value
ABC 3 20.5
ABC 5 19.5
..
..
..
Karthik
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You forgot to include the code column in your Pivot stage as a pivoting column.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

ray.wurlod wrote:You forgot to include the code column in your Pivot stage as a pivoting column.
???

I have given the columns as below in the output section of PIVOT.

Code: Select all

columnname             derivation
ven_symbol               
value                       e3,e5,e6,e8,e9,e10,e12,e13
if i include 'code' in the output, what should i use in the derivation?
Karthik
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

Code: Select all

Payments Example
CUSTID LNAME JAN_Sales FEB_Sales MARCH_Sales JUN_Pay DEC_Pay
100 Smith $1,234.00 $1,456.00 $1,578.00 $6,298.00 $7,050.00
101 Yamada $1,245.00 $1,765.00 $1,934.00 $7,290.00 $7,975.00

Output Data in Target Rows After Pivot

Code: Select all

CUSTID LNAME Sales Payments
100 Smith $1,234.00 $6,298.00
100 Smith $1,456.00 $7,050.00
100 Smith $1,578.00 null
101 Yamada $1,245.00 $7,290.00
101 Yamada $1,765.00 $7,975.00
101 Yamada $1,934.00 null
This example has been taken from 'pivot.pdf'.

My requirement is , i need

JAN_Sales FEB_Sales MARCH_Sales

in the output like below

Code: Select all

CUSTID LNAME Month Sales Payments
100 Smith JAN_Sales $1,234.00 $6,298.00
100 Smith FEB_Sales  $1,456.00 $7,050.00
100 Smith MARCH_Sales  $1,578.00 null
101 Yamada JAN_Sales $1,245.00 $7,290.00
101 Yamada FEB_Sales  $1,765.00 $7,975.00
101 Yamada MARCH_Sales $1,934.00 null
Karthik
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Will the input columns be static ?

If yes, then pass the rows as separate links from a transformer and collect using link collector.

If it is variable, store the first row in a stage variable and pass it to pivot stage as additional column to pivot.
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

Sainath.Srinivasan wrote:Will the input columns be static ?
yes.

But i am not getting your point. can you please explain it?
Karthik
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Essentially you have to include the numeric column headings' values as the sources for the pivot column code - given that they are constant, you can use a Transformer stage to generate the values.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

ray.wurlod wrote:given that they are constant, you can use a Transformer stage to generate the values.
how should i generate the values? I don't know the way to do it in datastage. i mean the methods i follow. can you give me the sample job design here?
Karthik
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You add them yourself, manually.
-craig

"You can never have too many knives" -- Logan Nine Fingers
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

chulett wrote:You add them yourself, manually.
can you give me the sample job design here?
Karthik
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Add four new columns to the transformer, manually populate them with your heading names ("JAN_Sales", etc) and pivot them along with their actual, matched data.
-craig

"You can never have too many knives" -- Logan Nine Fingers
karthi_gana
Premium Member
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

chulett wrote:Add four new columns to the transformer, manually populate them with your heading names ("JAN_Sales", etc) and pivot them along with their actual, matched data.
yes. it is working perfectly :)

Thanks Craig and Ray :)

How about the performance of this job? If i receive around 500,000 records every day, will it leads to performance issue?

I have referred some old projects and they used 'ROUTINE' to do 'Row to Column' transpose. Initially, I have been asked to modify the existing routine to do 'Column to Row' transpose. But as i am not that much strong in writing routines,I looked for some work around to do this. I did some search here and got the solution with the help of my datastage mentors(Ray,Craig,Sainath,Arnd etc.,).

But i would like to know which one will be good when we think about performance?

1) PIVOT method
2) ROUTINE
Last edited by karthi_gana on Fri Jul 30, 2010 7:39 am, edited 1 time in total.
Karthik
arunpramanik
Participant
Posts: 63
Joined: Fri Jun 22, 2007 7:27 am
Location: Kolkata

Post by arunpramanik »

Here is another way to do -

Consider your file contains Column Header.

read the file
seperate into two streams
Stream1 will have column header (@inrownum =1)
Stream2 will have data (@inrownum >1)

for Stream1 -
pivot the cloumn header
custid key
lname = lname
mth = jan_sales, ...........,dec_pay
assign a new col rwnm =@outrownum
store it to hash file with rwnm (as key)

hash file will store the following
rwnm , mth
1, jan_sales
....
5, dec_pay

for Stream2 -
do the pivot similarly
assign a new column rwnm
generate rwnm using stage variables
store the data in seq file

seq file will look like -
custid, lname, rwnm, amt
100,"abc", 1, 100
....
100,"abc",5, 500
101,"def",1, 1000
....
101."def",5,5000

now join seqfile and hash file with seqfile.rwnum =hashfile.rwnum
and get the desired output
Post Reply