Separating Extract, Transform and Load to three or more jobs

olgc · Post by **olgc** » Mon Jul 25, 2005 11:53 am

Hi guys,

There is a practice in ETL coding:

Separate Extract, Transform and Load into three or more jobs so that, if one step fails, the output of the previous steps can be reused. That sounds good in theory. Could anyone tell me where this practice is documented?

Have a nice summer day,

ray.wurlod · Post by **ray.wurlod** » Mon Jul 25, 2005 9:43 pm

The training class "DataStage Best Practices", for one.

Freezing in an Australian winter (32C in Darwin!)

vmcburney · Post by **vmcburney** » Tue Jul 26, 2005 12:54 am

It is well documented in the archives of this site. Do a search for failover or recovery or banding to find threads on the subject.

kumar_s · Post by **kumar_s** » Fri Aug 05, 2005 12:15 pm

Hi,
Maybe the calling of the jobs through scripts can be split up by phase.

For example, you could run the extraction through one script, the transformation through another, and likewise the load, so that each phase gets its own restart point.

This works provided the traceability matrix among the jobs is accurate.

regards
kumar

DaleK · Post by **DaleK** » Fri Aug 05, 2005 12:53 pm

If I understand the problem, it isn't the best practice that is at fault, but the tool or method you use to schedule and run your jobs.

We use our mainframe scheduling tool to run our DataStage jobs. This tool allows us to rerun jobs. I guess I just have it a little easier than some of you.

Either way it sounds like you have one heck of a mess.
Best of luck.

ray.wurlod · Post by **ray.wurlod** » Fri Aug 05, 2005 7:16 pm

Sounds like some more thought needs to go into the design of your control structures and restart points. It's all doable, and gracefully. But it must be designed with care.

olgc · Post by **olgc** » Tue Aug 09, 2005 9:42 am

To split ETL into E, T and L, an integrated job can always be built first as a single ETL job. Once that job is tested, it can be split into 3, 4, or even 20 or so E, T and L jobs to make it conform to the practice, which also makes maintenance and support more challenging. Is that the best practice?

ray.wurlod · Post by **ray.wurlod** » Tue Aug 09, 2005 3:08 pm

We normally advocate four phases.

Extraction to first staging area.

Jobs presupposed by Transformation phase (e.g. loading lookups).

Transformation into second staging area.

Loading from staging area into target.
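
As a rough illustration of how the four phases above could be chained with restart points, here is a minimal Python sketch. It is not DataStage code and not how any particular scheduler does it. The phase names, the `run_pipeline` function and the JSON checkpoint file are all assumptions made up for the example, and each job is just a stand-in callable for whatever actually launches the real job.

```python
import json
from pathlib import Path

# Phase names mirror the four phases listed above; the names are hypothetical.
PHASES = ["extract", "prepare_lookups", "transform", "load"]


def run_pipeline(jobs, checkpoint_path):
    """Run each phase in order, skipping phases already recorded as done.

    jobs maps a phase name to a zero-argument callable (a stand-in for
    whatever launches the real job). Returns the phases run on this call.
    """
    checkpoint = Path(checkpoint_path)
    done = json.loads(checkpoint.read_text()) if checkpoint.exists() else []
    ran = []
    for phase in PHASES:
        if phase in done:
            continue  # this phase's staged output is assumed still valid
        jobs[phase]()  # an exception here leaves the checkpoint as it was
        done.append(phase)
        checkpoint.write_text(json.dumps(done))  # record the restart point
        ran.append(phase)
    checkpoint.unlink(missing_ok=True)  # full success: next run starts fresh
    return ran
```

If the transformation job raises an error, the checkpoint file still lists the extraction and lookup phases, so calling `run_pipeline` again with the same checkpoint path resumes at the transformation and reuses what is already in the first staging area. Whether the staged data is still safe to reuse after a failure is a design decision the sketch does not address.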