We are planning to design a generic component in DataStage with the following specifications:
It has to read the metadata of a file at run time.
It will contain different keys for different file definitions.
It then has to perform aggregations at run time based on those file definitions.
The field to which the aggregation is applied is also dynamic.
We plan to build this component by first creating dynamic schema files, which should be read in a custom stage.
The custom stage then has to concatenate all the key fields into one generic key, which will be used for the aggregation.
Can somebody throw some light on the following points, assuming that we receive a CSV input file?
1) How to generate dynamic schema files based on the different input files we receive.
I have gone through the Parallel Job Developer Guide to create a schema file, which can be done when we know the format in advance. But we cannot create one at run time by inspecting the input file's metadata.
2) How can we concatenate only key fields if their position and number is different for each input file?
For example, the first input file may contain 5 columns, say A, B, C, D, E, of which C and D are the key columns.
The second input file may contain 10 columns M, N, O, P, Q, R, S, T, U, V, of which M, P, S and U are the key columns.
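Since a CSV file with a header row already carries its column names, one way to approach point 1 is a small pre-job script that reads the first row and writes out an Orchestrate-style schema file for the custom stage. A rough sketch in Python, assuming every column can be treated as a bounded varchar (the header gives no type information, and the 255 length and the record-level properties are assumptions you would tune to your import settings):

```python
import csv

def make_schema_file(csv_path, schema_path):
    """Read the header row of a CSV file and emit a DataStage
    (Orchestrate) schema file that describes every column as a
    bounded string, since the header carries no type information."""
    with open(csv_path, newline="") as f:
        header = next(csv.reader(f))  # first row = column names

    lines = ["record {final_delim=end, delim=',', quote=double} ("]
    for name in header:
        # string[max=255] is an arbitrary assumption; widen as needed
        lines.append(f"    {name}: string[max=255];")
    lines.append(")")

    with open(schema_path, "w") as f:
        f.write("\n".join(lines) + "\n")
```

The generated schema file could then be passed to the job (e.g. via a job parameter) before the run, so the stage picks up whatever columns that client's file happens to have.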
Dynamic aggregation.
Generally, this component is to be used for the clients whose data we process.
Suppose we have 5 clients today; then we will have 5 different file formats, and the number of formats will grow as the number of clients grows.
The input files can be CSV files with the first row as column names. Can we read this first row and create a schema file for the metadata of the custom stage?
Regarding the key columns we will create a reference file/table to specify the key column names and the custom stage has to read the key columns from the reference and concatenate the fields with that name from the input file.
Can you please let me know how feasible is this solution?
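Because the custom stage resolves key columns by name from the reference file, their position and count no longer matter. A minimal sketch in Python of the concatenate-and-aggregate logic, assuming the key column names have already been looked up from your reference file/table for that client, and assuming the aggregation is a sum over one measure column (the `|` separator and the choice of sum are my assumptions, not part of your spec):

```python
import csv
from collections import defaultdict

def aggregate(csv_path, key_names, measure_name):
    """Concatenate the named key columns of each row into one
    generic key and sum the measure column per key. Columns are
    resolved by name from the header row, so their positions may
    differ from one client file to the next."""
    totals = defaultdict(float)
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)  # maps each row by column name
        for row in reader:
            generic_key = "|".join(row[k] for k in key_names)
            totals[generic_key] += float(row[measure_name])
    return dict(totals)
```

For your two examples, the same function handles both: `aggregate(file1, ["C", "D"], measure)` and `aggregate(file2, ["M", "P", "S", "U"], measure)`, with the key lists read from the reference per client.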
Should be do-able. Your custom stage will need to refer to the reference file/table, of course. And you will need to come up with a convention for naming the schema file and getting this into the job(s). Main problem is that, if you need to do any transformation, you must make reference to a specific column/field name.
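As one possible example of such a naming convention (the directory, extension, and derivation rule here are all assumptions, not an established standard), the schema file could simply share the input file's base name:

```python
import os

def schema_path_for(input_path, schema_dir="/etc/schemas"):
    """One possible convention: the schema file shares the input
    file's base name, with a .schema extension, in a fixed
    directory. All three choices are assumptions to adapt."""
    base = os.path.splitext(os.path.basename(input_path))[0]
    return os.path.join(schema_dir, base + ".schema")
```

A before-job routine could then derive the schema path from the incoming file name and hand it to the job as a parameter.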
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.