Parameterize Pivot logic

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ds_search2008
Participant
Posts: 75
Joined: Thu Nov 27, 2008 10:12 am

Parameterize Pivot logic

Post by ds_search2008 »

There is a scenario where the input row looks like this (the third number/data pair follows the same naming pattern as the first two):

Col1 | col#_1 | col_data_1 | col#_2 | col_data_2 | col#_3 | col_data_3
100  | 1      | Z          | 2      | Y          | 3      | X


Using the Pivot stage I'm able to achieve this output:

C1  C2  C3  C4
100  1   Z   A
100  2   Y   A
100  3   X   A

There are various inputs where the paired columns may extend up to 'n' pairs, i.e.
col#_1 ... col#_n
and
col_data_1 ... col_data_n

The logic and output pattern above are correct, but I need to make this job work for any number 'n' of col# and col_data columns. With the Pivot stage, the required columns must be hard-coded in Pivot --> Properties --> Output --> Derivation. I need a single job whose logic handles a varying number of col# and col_data columns across different inputs.

Could you please suggest some options?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Hard-coding for each particular N is your only option in a parallel job.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ds_search2008
Participant
Posts: 75
Joined: Thu Nov 27, 2008 10:12 am

Post by ds_search2008 »

The number of col#_n and col_data_n columns defined in the input varies. If I hard-code these columns in the Pivot stage, the job works for one input but fails for another. Therefore, I'm looking for an option that makes a single job suitable for all the possible inputs.

Say, the job should work for an input file with columns col#1, col#2, col#3, or for an input file with only col#1 and col#2, and so on.
Similarly for the col_data columns...
Can you please share your ideas?
kandyshandy
Participant
Posts: 597
Joined: Fri Apr 29, 2005 6:19 am
Location: Singapore

Post by kandyshandy »

You could do this in the BASIC language and store it as a routine. I don't think you can handle a variable number of columns in a DataStage Server or PX job.
Kandy
_________________
Try and try again… you will succeed at last!!
Aruna Gutti
Premium Member
Posts: 145
Joined: Fri Sep 21, 2007 9:35 am
Location: Boston

Post by Aruna Gutti »

I am not sure how many fields you have to handle in your job.

I worked on a job like this, where I had different inputs with different numbers of fields. The way I handled it was to pad every input row out to the expected maximum number of columns across all input sources, then funnel them into a Pivot stage.

Hope this helps,

Aruna.
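Outside DataStage, Aruna's pad-to-the-maximum approach can be sketched with a one-line awk filter on a pipe-delimited file. This is only an illustration under assumed inputs: the delimiter, the sample row, and MAX=7 (a key column plus three number/data pairs) are assumptions, not from the original job.

```shell
# Pad every pipe-delimited row out to a fixed field count (MAX=7 here:
# one key column plus three number/data pairs) so all input sources
# share one schema before the funnel. Sample data is assumed.
printf '100|1|Z\n' |
awk -F'|' -v OFS='|' -v MAX=7 '{
    for (i = NF + 1; i <= MAX; i++) $i = ""   # append empty trailing fields
    print
}'
# → 100|1|Z||||
```

Rows that already have MAX fields pass through unchanged, so the same filter can sit in front of every source.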
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

If the source is a sequential file, you can use awk in the first stage itself to re-organize the data.
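A minimal sketch of that awk re-organization, assuming a pipe-delimited row of the form key|num_1|data_1|num_2|data_2|... — it unpivots one input row into one output row per pair, for any number of pairs (the delimiter and sample data are assumptions):

```shell
# Unpivot: emit key|num|data once per (num, data) pair, regardless of
# how many pairs the row carries. Sample row matches the thread's example.
printf '100|1|Z|2|Y|3|X\n' |
awk -F'|' -v OFS='|' '{
    for (i = 2; i < NF; i += 2)   # walk the (num, data) pairs
        print $1, $i, $(i + 1)
}'
# → 100|1|Z
#   100|2|Y
#   100|3|X
```

Because awk loops over however many fields each row has, this handles the variable 'n' that the Pivot stage's hard-coded derivations cannot.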
ds_search2008
Participant
Posts: 75
Joined: Thu Nov 27, 2008 10:12 am

Post by ds_search2008 »

Thanks