Page 1 of 1

Parameterize Pivot logic

Posted: Wed Dec 10, 2008 2:33 am
by ds_search2008
There is a scenario where the row values

Col1 | col#_1| col_data1| col#_2 | col_data_2| col6| col_data_3
100 | 1 | Z | 2 | Y | 3 | X


Using Pivot stage I'm able to achieve the output

C1 C2 C3 C4
100 1 Z A
100 2 Y A
100 3 X A

There are various inputs where the input columns Col#_1 and Col_data_2 may extend upto 'n' columns. I mean
Col#_1....col#_n
and
col_data_2.....col_data_n

The above logic and output patten is correct, however I need to make this job workable for 'n' number of Col#_1 and col_data_2 column values. With pivot we can hard coding the required columns in Pivot-->properties-->output -->derivation. I need to make this logic fit for more number of Col#_1 and col_data_2 columns for various inputs with single job.

Could you please suggest me some options?

Posted: Wed Dec 10, 2008 3:56 am
by ray.wurlod
Hard coding for each particular N is your only option in parallel job.

Posted: Wed Dec 10, 2008 9:35 am
by ds_search2008
The columns col#_1 and col_data_1 that are defined in the inputs vary in number. If I hard code these columns in pivot stage the job would work for one input but the same would fail for another. Therefore, I'm looking for an option that can make this job suitable for many possible inputs.

Say, for the input file with columns col#1, col#2, col#3 or for the input file with columns Col#1, col#2 etc. the job should work.
Similarly for col_data_1 column...
Can you please share your ideas.

Posted: Wed Dec 10, 2008 10:23 am
by kandyshandy
You can do this in BASIC language and store it as a routine. I don't think you can handle variable columns in DS Server or PX job.

Posted: Wed Dec 10, 2008 1:16 pm
by Aruna Gutti
I am not sure the number of fields you have to handle in your job.

I worked on this kind of job where I have different inputs with different number of fields. The way I handled it is make all the input rows have expected maximum number of columns for all input sources and funneled them into a pivot stage.

Hope this helps,

Aruna.

Posted: Thu Dec 11, 2008 4:20 am
by Sainath.Srinivasan
If it is in seq file (source), you can use awk in the first stage itself to re-organize

Posted: Thu Dec 11, 2008 4:23 am
by Sainath.Srinivasan
If it is in seq file (source), you can use awk in the first stage itself to re-organize

Posted: Fri Dec 12, 2008 9:27 am
by ds_search2008
Thanks

Posted: Fri Dec 12, 2008 9:28 am
by ds_search2008
Thanks