Pivoting Dynamic number of columns

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
dsuser7
Participant
Posts: 41
Joined: Sat Nov 29, 2008 2:21 am

Pivoting Dynamic number of columns

Post by dsuser7 »

Hi,

I'm trying to obtain the below output. The below input sample comes in a csv file.
I'm using V8.1


Input

col1,col2,col3,col4,col3,col4
a,b,x1,y1,x2,y2
q,w,r,s

Output
a,b,x1,y1
a,b,x2,y2
q,w,r,s

This is a case of pivoting but the set of col3,col4 may repeat any number of times. The column names col3 and col4 repeat with same name.

If pivot stage is to be used by defining maximum number of columns (which will be a guess), the Sequential stage used to read the .csv file would fail to read if the number of maximum columns are not present in other rows.

Any ideas on how to achieve this.
Thanks.
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

I think similar requirements have been discussed here just recently, but:

Read the record as a single varchar column
(8.5) Use a transformer with looping to build the pivoted output records
(pre-8.5) Use a transfomer with multiple output links to build the pivoted records and funnel them together, or use a buildop/custom operator

Any version: Use awk/perl/C/C++ with an external source stage/SeqFile filter/beforeJob ExecSH to do the pivoting outside of DS.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
dsuser7
Participant
Posts: 41
Joined: Sat Nov 29, 2008 2:21 am

Post by dsuser7 »

jwiles wrote:I think similar requirements have been discussed here just recently, but:

Read the record as a single varchar column
(8.5) Use a transformer with looping to build the pivoted output records
(pre-8.5) Use a transfomer with multiple output links to build the pivoted records and funnel them together, or use a buildop/custom operator

Any version: Use awk/perl/C/C++ with an external source stage/SeqFile filter/beforeJob ExecSH to do the pivoting outside of DS.

Regards,
Thanks for the reply JWiles.
I have tried using BuildOps but couldn't really go too far as I didn't knew what needs to be put in the PreLoop,Per-record and PostLoop.
I have read the material that comes with product but really couldn't figure out how to do it.
jwiles
Premium Member
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm
Contact:

Post by jwiles »

The BuildOp interface essentially presents a framework/IDE for coding C++ logic within DS Designer. The three sections--Pre-Loop, Per-Record and Post-Loop--would contain logic to be performed during job initialization, processing records and end-of-job respectively.

Too much more to go into than is appropriate for the forum, tho.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Attend the IBM Advanced DataStage class. You will learn about, and do, construction of a working Build stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

This is more easily accomplished with a server job, in which the Sequential File stage has a "missing column" rule.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply