Dynamic aggregation.

Raghavendra
Participant
Posts: 147
Joined: Sat Apr 30, 2005 1:23 am
Location: Bangalore,India

Post by Raghavendra »

We are planning to design a generic component in DataStage with the following specifications.

It has to read the metadata of a file at run time.
It will contain different keys for different file definitions.
Then it has to perform the aggregations at runtime based on the file definitions.
The field to which the aggregation has to be applied is also dynamic.

Our plan is first to create dynamic schema files, which a custom stage will read.
The custom stage then has to concatenate all the key fields into one generic key, which will be used for the aggregation.

Can somebody shed some light on the following points (assuming that we receive a CSV input file)?
1) How can we generate dynamic schema files based on the different input files we receive?
I have gone through the Parallel Job Developer's Guide to create a schema file, which can be done when we know the format in advance. But we cannot see how to create it at run time from the input file's metadata.

2) How can we concatenate only the key fields if their position and number differ for each input file?
For example, in the first case the input file may contain 5 columns, say A, B, C, D, E, of which C and D are the key columns.
In the second case the input file may contain 10 columns, M, N, O, P, Q, R, S, T, U, V, of which M, P, S and U are the key columns.
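
To make question 2 concrete, here is a minimal Python sketch of the logic outside DataStage. It is not a custom stage and not anything described in this thread: the function name `aggregate_by_keys`, the `|` separator and the choice of a sum as the aggregation are all assumptions made for illustration. Key columns are looked up by name from the header row, so their position and number can differ per file.

```python
import csv
import io
from collections import defaultdict


def aggregate_by_keys(csv_text, key_columns, value_column, sep="|"):
    """Sum value_column, grouped by the concatenation of key_columns.

    Hypothetical helper: columns are resolved by name from the header row,
    so the same code handles files with different layouts.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    wanted = list(key_columns) + [value_column]
    missing = [name for name in wanted if name not in fieldnames]
    if missing:
        raise ValueError(f"columns not found in input: {missing}")

    totals = defaultdict(float)
    for row in reader:
        # Build the single generic key from however many key fields exist.
        generic_key = sep.join(row[name] for name in key_columns)
        totals[generic_key] += float(row[value_column])
    return dict(totals)


# First case from the post: columns A..E, keys C and D (E is assumed to be
# the field to aggregate; the post does not say which one it is).
sample = "A,B,C,D,E\n1,x,k1,k2,10\n2,y,k1,k2,5\n3,z,k1,k3,7\n"
result = aggregate_by_keys(sample, ["C", "D"], "E")
print(result)
```

A separator between the key parts matters: without one, the key values "ab" + "c" and "a" + "bc" would collapse into the same generic key.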
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

How precisely do you plan to "read the metadata of a file"?

Without knowing this it is rather hard to provide further advice.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by Raghavendra »

Generally this component has to be used for all the clients whose data we process.
If we have 5 clients today, we will have 5 different file formats, and the number of file formats will grow as the number of clients increases.

The input files can be CSV files with the first row as column names. Can we read this first row and create a schema file for the metadata of the custom stage?
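
As a rough illustration of that idea, the Python sketch below reads only the header row and writes out text in the record-schema layout used by parallel job schema files. Several things here are assumptions rather than facts from this thread: the helper name `schema_from_header`, typing every column as a nullable string with a maximum length of 255, and the delimiter and quote properties. A header row carries names only, so real data types would have to come from somewhere else.

```python
import csv
import io


def schema_from_header(csv_text, max_len=255):
    """Build schema-file text from the first row of a CSV input.

    Hypothetical helper: every column is typed as a nullable string,
    because the header row gives column names but no type information.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    lines = ["record", "  {final_delim=end, delim=',', quote=double}", "("]
    for name in header:
        column = name.strip()
        # Approximate check only: reject names that are not plain identifiers.
        if not column.isidentifier():
            raise ValueError(f"unsupported column name: {column!r}")
        lines.append(f"  {column}: nullable string[max={max_len}];")
    lines.append(")")
    return "\n".join(lines)


schema = schema_from_header("A,B,C,D,E\n1,x,k1,k2,10\n")
print(schema)
```

The job reading the data would still have to skip the header row itself, and the generated text would need to be saved under whatever schema file name the job expects.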

Regarding the key columns, we will create a reference file/table that specifies the key column names. The custom stage has to read the key column names from this reference and concatenate the fields with those names from the input file.

Can you please let me know how feasible this solution is?
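
For illustration only, the reference lookup described above might look like the Python sketch below. The reference layout (one row per client, key column names separated by `;`, plus a `measure` column naming the field to aggregate), the client names and both helper functions are hypothetical; nothing in this thread fixes the format of the reference file/table.

```python
import csv
import io

# Hypothetical reference content: which columns are keys for each client.
REFERENCE = """client,key_columns,measure
client1,C;D,E
client2,M;P;S;U,V
"""


def load_key_reference(reference_text):
    """Return {client: {"keys": [...], "measure": name}} from the reference."""
    reference = {}
    for row in csv.DictReader(io.StringIO(reference_text)):
        reference[row["client"]] = {
            "keys": row["key_columns"].split(";"),
            "measure": row["measure"],
        }
    return reference


def resolve_positions(header, names):
    """Map column names to their positions in this particular file's header."""
    missing = [name for name in names if name not in header]
    if missing:
        raise ValueError(f"columns not found in header: {missing}")
    return [header.index(name) for name in names]


reference = load_key_reference(REFERENCE)
header = "M,N,O,P,Q,R,S,T,U,V".split(",")
positions = resolve_positions(header, reference["client2"]["keys"])
print(positions)
```

Keeping the names in the reference and resolving positions per file means a new client only needs a new reference row, not a change to the component.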

Post by ray.wurlod »

Should be do-able. Your custom stage will need to refer to the reference file/table, of course. And you will need to come up with a convention for naming the schema file and getting this into the job(s). Main problem is that, if you need to do any transformation, you must make reference to a specific column/field name.

Post by Raghavendra »

Ray, thanks for your response. I will be sorting out my premium membership soon.
Terala
Premium Member
Posts: 73
Joined: Wed Apr 06, 2005 3:04 pm

Post by Terala »

Hi Ravindra,

I have pretty much the same kind of requirement you had.
I hope you have built a solution for your requirement. If so, can you share how you did it?

Thanks,
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Terala - best if you start your own post and give us specifics of your requirement. "Pretty much the same" doesn't really give people the information they would need to help you.
-craig

"You can never have too many knives" -- Logan Nine Fingers