Separating Extract, Transform and Load into three or more jobs

Posted: Mon Jul 25, 2005 11:53 am
by olgc
Hi guys,

There is a practice in ETL coding:

Separating Extract, Transform and Load into three or more jobs, so that if some step fails, the output of the previous steps can be reused. That sounds good in theory. Could anyone tell me where this practice is documented?

Have a nice summer day,

Posted: Mon Jul 25, 2005 9:43 pm
by ray.wurlod
The training class DataStage Best Practices, for one.

Freezing in an Australian winter (32C in Darwin!)

Posted: Tue Jul 26, 2005 12:54 am
by vmcburney
It is well documented in the archives of this site. Do a search for failover or recovery or banding to find threads on the subject.

Posted: Fri Aug 05, 2005 12:15 pm
by kumar_s
Hi,
Maybe the calling of the jobs through scripts can be split vertically.

For example, you can try segregating the extraction into one script, the transformation into another, and likewise for the load, so that each phase gets its own restart point.

Provided the traceability matrix among the jobs is perfect.

regards
kumar

Posted: Fri Aug 05, 2005 12:53 pm
by DaleK
If I understand the problem, it isn't the best practice that is the problem, but the tool or method you use to schedule and run your jobs.

We use our mainframe scheduling tool to run our DataStage jobs. This tool allows us to rerun jobs. I guess I just have it a little easier than some of you.

Either way it sounds like you have one heck of a mess.
Best of luck.

Posted: Fri Aug 05, 2005 7:16 pm
by ray.wurlod
Sounds like some more thought needs to go into the design of your control structures and restart points. It's all doable, and gracefully. But it must be designed with care.

Posted: Tue Aug 09, 2005 9:42 am
by olgc
To split ETL into E, T and L, an integrated ETL job can always be built first. Once that job is tested, it can be split into 3, 4, or even 20-odd separate E, T and L jobs so that it conforms to the practice, which makes maintenance and support more challenging. Is that the best practice?

Posted: Tue Aug 09, 2005 3:08 pm
by ray.wurlod
We normally advocate four phases.
  • Extraction to first staging area.
  • Jobs presupposed by Transformation phase (e.g. loading lookups).
  • Transformation into second staging area.
  • Loading from staging area into target.
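
To make the restart idea concrete, here is a minimal sketch of how the four phases above might be chained so that a rerun resumes at the failed phase. It is plain Python rather than DataStage, and everything in it is an assumption for illustration: the phase names, the `run_pipeline` function, and the JSON checkpoint file are hypothetical, not part of any DataStage API.

```python
import json
import os

# Hypothetical phase names mirroring the four phases listed above.
PHASES = ["extract", "prepare_lookups", "transform", "load"]


def run_pipeline(phase_funcs, checkpoint_path):
    """Run the phases in order, skipping any already recorded as complete.

    phase_funcs maps each phase name to a zero-argument callable that
    raises an exception on failure. Returns the phases run in this call.
    """
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)

    executed = []
    for name in PHASES:
        if name in done:
            continue  # finished in an earlier run; its staged output is reused
        # If this call raises, the checkpoint is left unchanged, so the
        # next run starts again at this same phase.
        phase_funcs[name]()
        done.append(name)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)
        executed.append(name)
    return executed
```

If the transform phase fails on the first run, calling `run_pipeline` again with the same checkpoint file skips the extraction and lookup phases and resumes at the transform. Deleting the checkpoint file forces a full rerun. This only works if each phase writes to its own staging area, which is the point of keeping the phases separate.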