Dynamically define the column names

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Nagaraj,

It is clear that you want to split a single column into multiple - using a delimiter embedded in that column. That is ok.

The questions are
1.) How do you intend to use the resulting file as the down-stream process will not be able to recognize this changing metadata ?
2.) If first line has 2 embedded fields and next has 10, how will you decide how many columns it must write ?
3.) Where will you be storing the result ? What will be its purpose ?

Answers to these questions will clarify for others.
Nagaraj
Premium Member
Premium Member
Posts: 383
Joined: Thu Nov 08, 2007 12:32 am
Location: Bangalore

Post by Nagaraj »

I have already got the logic working, by defining more columns than required in transformer, And after that i have more complex rules applied....so i am justing keeping those things seperate to avoid confusion.

2. If first line has 2 embedded fields and next has 10, how will you decide how many columns it must write ? >>>>> this is exactly what i shd find out. this is what was my original question. i should be able to loop thru based on the no of values present

Ex if there are 5 values it should be populated in the first five columns and with the remaining columns it shd default which will be removed later.....
next record if there are only two values it shd utilize only the first 2 columns and the rest will be defaulted which will be removed further and so on.....

i hope this is clear now.

Thanks
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

In that case, you can fork you stream and in that stream create two columns.

1.) dummyKey = always constant 1
2.) columnCount = Count(multiValueField, ',') + 1

This is then passed through an aggregator to find the max(columnCount) and referred in your main stream.
Post Reply