Reverse Pivot in Parallel Jobs (DS8) ?

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
basiltarun
Premium Member
Premium Member
Posts: 15
Joined: Tue Sep 09, 2003 10:41 pm

Reverse Pivot in Parallel Jobs (DS8) ?

Post by basiltarun »

I have used a Pivot Stage in DS8 Parallel job. After performing some calculations I need to reverse pivot the data back into one row. Is there an easy way to do it?

Example:

Pivot output:
ID1 Val1
ID1 Val2
ID1 Val3
ID2 Val1
ID2 Val2
ID3 Val3

Expected Reverse Pivot output:
ID1 Val1 Val2 Val3
ID2 Val1 Val2 Null
ID3 Null Null Val3
narasimha
Charter Member
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

Check out the below link

viewtopic.php?t=99149&postdays=0&postor ... ot&start=0

Hoping DS 8 works similar to DS 7.x
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
basiltarun
Premium Member
Premium Member
Posts: 15
Joined: Tue Sep 09, 2003 10:41 pm

Post by basiltarun »

narasimha wrote:Check out the below link

viewtopic.php?t=99149&postdays=0&postor ... ot&start=0

Hoping DS 8 works similar to DS 7.x

I looked thru the link you gave... nothing in there really explains how to do it in parallel jobs. I have done it in Server Jobs using Hash files or Stage Variables.

I tried to sort the rows by the key column and then read the rows and started putting the values in Stage variables. Every row I moved the current value into a previous value stage variable and checked if the current key col is equal to the previous key col. If they are diff then I write all the values in the stage variables. This does not really work in the parallel jobs because I lose the last line in the input. Hope this makes sense.

But I am looking for some ideas to do this in DS8. I cannot use the Datasets as Hash files where I can read and write into it at the same time.
Ramani
Participant
Posts: 58
Joined: Mon Oct 08, 2007 1:51 am

Post by Ramani »

Pivot output:
ID1 Val1
ID1 Val2
ID1 Val3
ID2 Val1
ID2 Val2
ID3 Val3

Use Transformer Stage and Remove Duplicate Stage.

In Transformer Stage, Use Stage Variable for Val1, Val2 & Val3 and Also ID(the key).
When you send the first row to transformer, you will have stage variables having the corresponding values like below
stg_Val1 = Val1, stg_Val2 = null, stg_Val3 = null and stg_ID = ID1
For 2nd Row
stg_Val1 = Val1, stg_Val2 = Val2, stg_Val3 = null and stg_ID = ID1
For 3rd Row
stg_Val1 = Val1, stg_Val2 = Val2, stg_Val3 = Val3 and stg_ID = ID1

For each row, you assign the Stage Variables to the output, so 3 rows as output would have been sent, and the next stage will hold the data like below.
col1 Col2 Col3 Col4
ID1 Val1 Null Null
ID1 Val1 Val2 Null
ID1 Val1 Val2 Val3


When you get the next key(ID), reset the Stage Variables. So for ID2 you would have output like below.
col1 Col2 Col3 Col4
ID2 Val1 Null Null
ID2 Val1 Val2 Null

so for the next key(ID), reset the Stage Variables again. So for ID3 you would have output like below.
col1 Col2 Col3 Col4
ID3 Val1 Null Null


Send all of these output to the Remove Duplicate stage, and you the same key as above and retain the last row for each keys.
So you would get output like below.
ID1 Val1 Val2 Val3
ID2 Val1 Val2 Null
ID3 Val1 Null Null

Hope this helps. This is the way I do, there should be another way of it. Please share if someone finds it.
Govindarajan
Participant
Posts: 24
Joined: Mon Jul 12, 2004 10:16 am

Re: Reverse Pivot in Parallel Jobs (DS8) ?

Post by Govindarajan »

Reverse PIVOT in datastage is too slow and best way to handle is

a) Push the dataset or data into a physical table
b) write select statement from the table using CASE statement
c) push the result into target
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Perhaps you can take some comfort from the fact that, from what I recall, the 8.1 release will have a stage (stages?) that will support both pivot 'directions'.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You remember correctly, but there is a caveat that none of the "go forward" plans is guaranteed to happen. Yes, it's on plan for version 8.1.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
basiltarun
Premium Member
Premium Member
Posts: 15
Joined: Tue Sep 09, 2003 10:41 pm

Post by basiltarun »

ray.wurlod wrote:You remember correctly, but there is a caveat that none of the "go forward" plans is guaranteed to happen. Yes, it's on plan for version 8.1.

Thanks all for your suggestions.

I did not have the liberty to use a temporary table, but was able to use the Stage Variables to control the data using a PrevVal and CurrVal stage variable for all the Columns. The logic is unnecessarily complicated, but atleast performance wise it has not been too bad.

I am trying to use the Remove Duplicate method suggested by Ramani, it sounds good.

Hope DS comes up with a stage specifically for this... would definitely kill al lot of unnecessary posts :wink:
Post Reply