Yet another Pivot Question (bit different though)

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
fridge
Premium Member
Premium Member
Posts: 136
Joined: Sat Jan 10, 2004 8:51 am

Yet another Pivot Question (bit different though)

Post by fridge »

hi,

I Have an input file containing a single pipe-delimited column and I want to be able to convert the pipe delimited 'subcolumns' into 'real' columns then pivot on them - the catch is I dont know before hand how many of the subcolumns there are

e.g.

source

Key DelimCol
A 1|4|5
B 7|8
C 9

target
A 1
A 4
A 5
B 7
B 8
C 9

As mentioned I cant be sure how many of the 'subcolumns' there are going to be

There are only a few thousand source records so performance not an issue

I believe the answer may be in the restructure staqges in PX (solution can be server or PX) but not to sure of best way forward

Thanks in advance
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Do you know the maximum number of 'delimited subcolumns' you can see? If so, then it's still just a regular pivot with something downstream of it to block the 'empty' subcolumns.
-craig

"You can never have too many knives" -- Logan Nine Fingers
fridge
Premium Member
Premium Member
Posts: 136
Joined: Sat Jan 10, 2004 8:51 am

Post by fridge »

I guess I do -could set the max to some thing big (e.g. 50) but didnt want the faff of it - wanted a more generic solution
ameyvaidya
Charter Member
Charter Member
Posts: 166
Joined: Wed Mar 16, 2005 6:52 am
Location: Mumbai, India

Post by ameyvaidya »

Server solution that will scale to any number of sub-columns:

Code: Select all

Source----->Transformer----->SeqFile----->Output
IN the transformer, have 2 output columns
Firstkey<--- In.key
Data---< ereplace(in.DelimCol,"|",char(10):in.key:",")

In the seqfile input link, set the following format:
Delimiter=","
Quote Char=000

IN the SeqFile Output, creater 2 columns:

Key,
Col

With the following format:
Delimiter=","
Quote Char=000

This has been tested to work.
I do not have a corresponding PX solution
IHTH
Amey Vaidya<i>
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.</i>
<i>- Douglas Adams</i>
fridge
Premium Member
Premium Member
Posts: 136
Joined: Sat Jan 10, 2004 8:51 am

Post by fridge »

Cool - that worked a treat - I knew I could do something clever but it was a long day :-)

I actually ended up writing an Awk program to do it - but this is a lot more suitable as I prefer to keep the logic visible as it were.

Many thanks ameyvidya

Ed
Post Reply