Reverse Pivot

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
UPS
Premium Member
Premium Member
Posts: 56
Joined: Tue Oct 10, 2006 12:18 pm
Location: New Jersey

Reverse Pivot

Post by UPS »

Newbie question :?: - Can anybody tell me how to accomplish what the PIVOT stage does but in reverse.
Input:
rec1: key1, fld1
rec2: key1, fld1
rec3: key1, fld1
rec4: key2, fld1
rec5: key2, fld1
...
OutPut:
rec1: key1,rec1.fld1,rec2.fld1,rec3.fld1
rec2: key2,rec4.fld1,rec5.fld1,,
...

The process should be able to handle some decision making like what output bucket to populate on certain criteria.

Any help, in detail, would be VERY appreciated
Larry.Griffith
Premium Member
Premium Member
Posts: 7
Joined: Wed Nov 02, 2005 12:19 pm

Post by Larry.Griffith »

The databases I work with are deep not wide, so I do this a lot.

Doing this in a parallel manner with a generic routine is difficult. I have done this with a C routine plug in. The trick is to get all the columns for the same record into the same partition. I do this by hash/sort partitioning on the key field (and only the key field) into the plugin.

Some of the things you need to know when doing this are

- keyid
- output order (unless you have a row for every column and you can garantee they are in the correct order)
- format (TEXT, VALUE, NUMBER, DATE) (If you need to use quotes, or need to format the data )
- column value

Good luck[/list]
Thanks
Larry
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

If your input data is sorted by key and fld1 then you can use the "classical" approach as done in Server jobs. Put in a transform stage and use stage variables to store your interim output columns and last column data data, set the constraint to only output rows when the key-fld1 changes and output the collected list of key and fld1 combinations.
UPS
Premium Member
Premium Member
Posts: 56
Joined: Tue Oct 10, 2006 12:18 pm
Location: New Jersey

Post by UPS »

Hi Arndw
You will know the key changed only when you read the row where the key changed so ideally
rec1: key1, fld1
rec2: key1, fld1
rec3: key1, fld1
rec4: key2, fld1
rec5: key2, fld1

Only when i read in rec4 Key2,fld1 i will know the key has changed so if you want me to write out what i have been holding till then in the stage variables in to the output link it will not work for the last record because you will never know when to write it.

i am trying with a couple of sort stages logic to find the key change the claasical approach. I have an other idea of writing everything out and then using the remove duplicates to retain the last row which will have the most consolidated information.

Thanks
Pavan
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Why dont you think it will work again? Keep apending the values to the stage variable untill a key change is encountered. In that case reset the stage variable and start appending untill the next key change is encountered. Do this till the last record. Then pass the output to remove duplicate stage and retain the last record for each id.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
UPS
Premium Member
Premium Member
Posts: 56
Joined: Tue Oct 10, 2006 12:18 pm
Location: New Jersey

RE: Reverse Pivoting

Post by UPS »

Hi Arndw and DSguru2B,
I understand your concept and I have done something very similar. I am looking for a Parallel solution but I will definitely keep this in mind.
I do not know the other person who posted under UPS. I am the original one who posted as UPS.

Thanks for your help.
John - jalden@ups.com
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

This will work in a parallel job as well. Just sort on the key and hash partition so that all identical keys end up in one partition.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
UPS
Premium Member
Premium Member
Posts: 56
Joined: Tue Oct 10, 2006 12:18 pm
Location: New Jersey

RE: Reverse Pivoting

Post by UPS »

DSguru2B,
What stage would I use then? Would I use a C routine as someone previously suggested?

Regards -
John
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Naa, I would go for a transformer stage, just like the way you do it in a server job.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
UPS
Premium Member
Premium Member
Posts: 56
Joined: Tue Oct 10, 2006 12:18 pm
Location: New Jersey

RE: Reverse Pivoting

Post by UPS »

DSguru2B,
Thanks - I will try this option and keep you appraised.

Thanks again.
John
Post Reply