Page 1 of 1

Reverse Pivot in Parallel Jobs (DS8) ?

Posted: Wed Oct 17, 2007 4:26 pm
by basiltarun
I have used a Pivot Stage in DS8 Parallel job. After performing some calculations I need to reverse pivot the data back into one row. Is there an easy way to do it?

Example:

Pivot output:
ID1 Val1
ID1 Val2
ID1 Val3
ID2 Val1
ID2 Val2
ID3 Val3

Expected Reverse Pivot output:
ID1 Val1 Val2 Val3
ID2 Val1 Val2 Null
ID3 Null Null Val3

Posted: Wed Oct 17, 2007 5:00 pm
by narasimha
Check out the below link

viewtopic.php?t=99149&postdays=0&postor ... ot&start=0

Hoping DS 8 works similar to DS 7.x

Posted: Wed Oct 17, 2007 7:58 pm
by basiltarun
narasimha wrote:Check out the below link

viewtopic.php?t=99149&postdays=0&postor ... ot&start=0

Hoping DS 8 works similar to DS 7.x

I looked thru the link you gave... nothing in there really explains how to do it in parallel jobs. I have done it in Server Jobs using Hash files or Stage Variables.

I tried to sort the rows by the key column and then read the rows and started putting the values in Stage variables. Every row I moved the current value into a previous value stage variable and checked if the current key col is equal to the previous key col. If they are diff then I write all the values in the stage variables. This does not really work in the parallel jobs because I lose the last line in the input. Hope this makes sense.

But I am looking for some ideas to do this in DS8. I cannot use the Datasets as Hash files where I can read and write into it at the same time.

Posted: Thu Oct 18, 2007 5:35 am
by Ramani
Pivot output:
ID1 Val1
ID1 Val2
ID1 Val3
ID2 Val1
ID2 Val2
ID3 Val3

Use Transformer Stage and Remove Duplicate Stage.

In Transformer Stage, Use Stage Variable for Val1, Val2 & Val3 and Also ID(the key).
When you send the first row to transformer, you will have stage variables having the corresponding values like below
stg_Val1 = Val1, stg_Val2 = null, stg_Val3 = null and stg_ID = ID1
For 2nd Row
stg_Val1 = Val1, stg_Val2 = Val2, stg_Val3 = null and stg_ID = ID1
For 3rd Row
stg_Val1 = Val1, stg_Val2 = Val2, stg_Val3 = Val3 and stg_ID = ID1

For each row, you assign the Stage Variables to the output, so 3 rows as output would have been sent, and the next stage will hold the data like below.
col1 Col2 Col3 Col4
ID1 Val1 Null Null
ID1 Val1 Val2 Null
ID1 Val1 Val2 Val3


When you get the next key(ID), reset the Stage Variables. So for ID2 you would have output like below.
col1 Col2 Col3 Col4
ID2 Val1 Null Null
ID2 Val1 Val2 Null

so for the next key(ID), reset the Stage Variables again. So for ID3 you would have output like below.
col1 Col2 Col3 Col4
ID3 Val1 Null Null


Send all of these output to the Remove Duplicate stage, and you the same key as above and retain the last row for each keys.
So you would get output like below.
ID1 Val1 Val2 Val3
ID2 Val1 Val2 Null
ID3 Val1 Null Null

Hope this helps. This is the way I do, there should be another way of it. Please share if someone finds it.

Re: Reverse Pivot in Parallel Jobs (DS8) ?

Posted: Tue Oct 23, 2007 5:08 am
by Govindarajan
Reverse PIVOT in datastage is too slow and best way to handle is

a) Push the dataset or data into a physical table
b) write select statement from the table using CASE statement
c) push the result into target

Posted: Tue Oct 23, 2007 6:29 am
by chulett
Perhaps you can take some comfort from the fact that, from what I recall, the 8.1 release will have a stage (stages?) that will support both pivot 'directions'.

Posted: Tue Oct 23, 2007 7:24 am
by ray.wurlod
You remember correctly, but there is a caveat that none of the "go forward" plans is guaranteed to happen. Yes, it's on plan for version 8.1.

Posted: Tue Oct 23, 2007 9:00 am
by basiltarun
ray.wurlod wrote:You remember correctly, but there is a caveat that none of the "go forward" plans is guaranteed to happen. Yes, it's on plan for version 8.1.

Thanks all for your suggestions.

I did not have the liberty to use a temporary table, but was able to use the Stage Variables to control the data using a PrevVal and CurrVal stage variable for all the Columns. The logic is unnecessarily complicated, but atleast performance wise it has not been too bad.

I am trying to use the Remove Duplicate method suggested by Ramani, it sounds good.

Hope DS comes up with a stage specifically for this... would definitely kill al lot of unnecessary posts :wink: