Page 1 of 1

Metadata Driven ETL

Posted: Wed May 11, 2016 12:49 am
by sudhakishore
Hi all,

One of the client we are working for is currently using SSIS as their ETL tool.
They currently have a setup where the meta information such as source, table, attribute, data type and target along with the transformation needed is available in a single table.
The table is leveraged so as to build a single ETL that points and does the job ( instead of creating multiple ETLs)

Currently the client wants to move away from SSIS ( and MS SQL) to Datastage (and Oracle).
However they want to leverage the same architecture of using the meta information compiled to create a single workflow to do all ETL jobs.

Can someone give me any pointers if similar implementation of meta data driven ETL implementation is done earlier.

Any help in this regard is very much desirable.

Posted: Wed May 11, 2016 4:29 am
by nikhil_bhasin
You can utilize some of the existing information if not all. For e.g. create a shell script to query the database table and pass parmeters like Source table/Tgt table, source query, insert/update query to datastage job.

Posted: Wed May 11, 2016 8:28 am
by UCDI
You can use runtime column propagation to do parts of it, such as extraction and loading, reusing the E and L pieces of your jobs.

But I am not sure how you could do transforms this way.

there is a way to have a stage call a job by name so you could do it using a job name for the transformation but you would still have to WRITE the transforms either in datastage or in IA rules.

We use a lot of flexible, resuable jobs but not quite to this level. We still operate somewhat on the columns by name.

Posted: Wed May 11, 2016 9:40 am
by chulett
What about the Data Rules stage? Wait, you can't honestly be thinking of migrating from SSIS to DataStage version 8 are you? :?

Posted: Thu May 12, 2016 4:34 am
by sudhakishore
Honestly we are migrating from SSIS to Datastage. Would there be any hiccups if we go so ?? . Datastage rules stage is one Good Idea where we have not tried.

Any other suggestions

Posted: Thu May 12, 2016 4:54 pm
by ray.wurlod
Specifications in the Modify stage can be fully parameterized.

However, the Modify stage has only a subset of the Transformer stage in terms of transformations; in particular it can only (in general) effect transformations that have a single input field.

Posted: Thu May 12, 2016 9:36 pm
by stuartjvnorton
You might be able to create a tool that generates OSH based on your information and then run that through the command line, but how to do that is way beyond my level of knowledge in that area.

Posted: Thu May 12, 2016 9:46 pm
by chulett
Hiccups? You betcha. I don't care what tool you migrate to, ETL or otherwise, there will be issues early on - either with the tool itself or your understanding on how to use it. Especially when you constrain the migration to recreating exactly what you have (and exactly how you do it) in the old tool. I've seen it in the ETL world, the relational database world, the BI world, you name it. It leads to madness and often, failure.

Rather than force the old square peg into your new shiny round hole, focus on building the best round peg you can without hobbling it with this "leverage the same architecture" mandate. IMHO.