Transpose Data

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
Dsnew
Participant
Posts: 90
Joined: Tue Apr 19, 2005 2:08 pm

Transpose Data

Post by Dsnew »

Hi all,

Can any body guide me in the below scenario

Table A has the below data

Code: Select all

Item1 Item2 Item3 Item_type Item_Qty
----  ----- ----- --------- --------
A	    B 	C	     x	      1
A	    B 	C	     y	      2
A	    B 	C	     z	      5
D	    E	 F	     y	      2
D	    E	 F	     x	      6
D	    E    F	     u	      8
D	    E	 F	     m         9

I want to transpose the Item_Type and Quantity in the below manner

Code: Select all

Item1 Item2 Item3 Item_type_x Item_type_y Item_type_z Item_type_u Item_type_m
----- ----- ----- ----------- ----------- ----------- ----------- -----------
A	   B 	   C	   1	      2		     5	       0		     0
D	   E	    F	   6	      2		     0	       8		     9
Note: The number of Item_types is known (5 in my case)
Let me know if more information is needed

Any pointers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

It's called a horizontal pivot, you can search the forum for more discussions, the Pivot stage only does a vertical pivot.

The easiest design to understand, in my opinion, involves a Transformer and an Aggregator. Send your current row w/ columns as supplied into a Transformer, but setup output metadata exactly like you have shown for your expected results.

Map the source column into the all of the like target columns, but use an If-Then-Else statement in the derivation. If the ItemType = "x" Then source value else NULL. Do this for all of the target columns.

Send the output link into an Aggregator stage. Group by the key columns, and use MAX() derivation function on the other columns. This is compress multiple input rows together. If there were 5 source rows, then 5 rows will go into the Aggregator, but one row will come out. Since 4/5 times each column is NULL, and 1/5 times has value, the MAX() will give you the value for each column no matter which input row supplied it. This is the easiest horizontal pivot to do.

Avoid solutions with stage variables, they enforce row sorting, as well as still have an issue with the last group having a row indicator to signal it's the last row. You can always sort the data to accelerate the Aggregator if your volume requires.

By the way, this solution works the same in Server or Parallel.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Dsnew
Participant
Posts: 90
Joined: Tue Apr 19, 2005 2:08 pm

Post by Dsnew »

Bulls eye!
Problem solved. Thanks Ken
Post Reply