Reverse Pivot in Parallel Jobs (DS8) ?

basiltarun · Post by **basiltarun** » Wed Oct 17, 2007 4:26 pm

I have used a Pivot Stage in DS8 Parallel job. After performing some calculations I need to reverse pivot the data back into one row. Is there an easy way to do it?

Example:

Pivot output:
ID1 Val1
ID1 Val2
ID1 Val3
ID2 Val1
ID2 Val2
ID3 Val3

Expected Reverse Pivot output:
ID1 Val1 Val2 Val3
ID2 Val1 Val2 Null
ID3 Null Null Val3

narasimha · Post by **narasimha** » Wed Oct 17, 2007 5:00 pm

Check out the below link

viewtopic.php?t=99149&postdays=0&postor ... ot&start=0

Hoping DS 8 works similar to DS 7.x

basiltarun · Post by **basiltarun** » Wed Oct 17, 2007 7:58 pm

narasimha wrote:Check out the below link

viewtopic.php?t=99149&postdays=0&postor ... ot&start=0

Hoping DS 8 works similar to DS 7.x

I looked thru the link you gave... nothing in there really explains how to do it in parallel jobs. I have done it in Server Jobs using Hash files or Stage Variables.

I tried to sort the rows by the key column and then read the rows and started putting the values in Stage variables. Every row I moved the current value into a previous value stage variable and checked if the current key col is equal to the previous key col. If they are diff then I write all the values in the stage variables. This does not really work in the parallel jobs because I lose the last line in the input. Hope this makes sense.

But I am looking for some ideas to do this in DS8. I cannot use the Datasets as Hash files where I can read and write into it at the same time.

Ramani · Post by **Ramani** » Thu Oct 18, 2007 5:35 am

Pivot output:
ID1 Val1
ID1 Val2
ID1 Val3
ID2 Val1
ID2 Val2
ID3 Val3

Use Transformer Stage and Remove Duplicate Stage.

In Transformer Stage, Use Stage Variable for Val1, Val2 & Val3 and Also ID(the key).
When you send the first row to transformer, you will have stage variables having the corresponding values like below
stg_Val1 = Val1, stg_Val2 = null, stg_Val3 = null and stg_ID = ID1
For 2nd Row
stg_Val1 = Val1, stg_Val2 = Val2, stg_Val3 = null and stg_ID = ID1
For 3rd Row
stg_Val1 = Val1, stg_Val2 = Val2, stg_Val3 = Val3 and stg_ID = ID1

For each row, you assign the Stage Variables to the output, so 3 rows as output would have been sent, and the next stage will hold the data like below.
col1 Col2 Col3 Col4
ID1 Val1 Null Null
ID1 Val1 Val2 Null
ID1 Val1 Val2 Val3

When you get the next key(ID), reset the Stage Variables. So for ID2 you would have output like below.
col1 Col2 Col3 Col4
ID2 Val1 Null Null
ID2 Val1 Val2 Null

so for the next key(ID), reset the Stage Variables again. So for ID3 you would have output like below.
col1 Col2 Col3 Col4
ID3 Val1 Null Null

Send all of these output to the Remove Duplicate stage, and you the same key as above and retain the last row for each keys.
So you would get output like below.
ID1 Val1 Val2 Val3
ID2 Val1 Val2 Null
ID3 Val1 Null Null

Hope this helps. This is the way I do, there should be another way of it. Please share if someone finds it.

Govindarajan · Post by **Govindarajan** » Tue Oct 23, 2007 5:08 am

Reverse PIVOT in datastage is too slow and best way to handle is

a) Push the dataset or data into a physical table
b) write select statement from the table using CASE statement
c) push the result into target

chulett · Post by **chulett** » Tue Oct 23, 2007 6:29 am

Perhaps you can take some comfort from the fact that, from what I recall, the 8.1 release will have a stage (stages?) that will support both pivot 'directions'.

ray.wurlod · Post by **ray.wurlod** » Tue Oct 23, 2007 7:24 am

You remember correctly, but there is a caveat that none of the "go forward" plans is guaranteed to happen. Yes, it's on plan for version 8.1.

basiltarun · Post by **basiltarun** » Tue Oct 23, 2007 9:00 am

ray.wurlod wrote:You remember correctly, but there is a caveat that none of the "go forward" plans is guaranteed to happen. Yes, it's on plan for version 8.1.

Thanks all for your suggestions.

I did not have the liberty to use a temporary table, but was able to use the Stage Variables to control the data using a PrevVal and CurrVal stage variable for all the Columns. The logic is unnecessarily complicated, but atleast performance wise it has not been too bad.

I am trying to use the Remove Duplicate method suggested by Ramani, it sounds good.

Hope DS comes up with a stage specifically for this... would definitely kill al lot of unnecessary posts

DSXchange

Reverse Pivot in Parallel Jobs (DS8) ?

Reverse Pivot in Parallel Jobs (DS8) ?

Re: Reverse Pivot in Parallel Jobs (DS8) ?