Hi guys,
There is a practice in ETL coding: separating Extract, Transform and Load into three or more jobs, so that if one step fails, the output of the earlier steps can be reused on restart. That sounds good in theory. Could anyone tell me where this practice is documented?
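To make clear what I mean, here is a rough sketch of the pattern in Python (the phase functions and checkpoint file name are made up, just to illustrate the restart idea):

```python
import json
import os

CHECKPOINT = "etl_checkpoint.json"  # records phases that finished OK

def extract():
    return [1, 2, 3]          # stand-in for the real extract job

def transform(rows):
    return [r * 10 for r in rows]  # stand-in for the real transform job

def load(rows):
    return len(rows)          # stand-in for the real load job

def run_pipeline():
    # Each phase's result is persisted with a checkpoint, so a failed
    # run can resume from the last completed phase instead of
    # re-running the extract against the source systems.
    state = {"done": [], "data": None}
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            state = json.load(f)

    phases = [("extract", extract), ("transform", transform), ("load", load)]
    for name, fn in phases:
        if name in state["done"]:
            continue  # already succeeded in a prior run, reuse its output
        state["data"] = fn(state["data"]) if name != "extract" else fn()
        state["done"].append(name)
        with open(CHECKPOINT, "w") as f:  # checkpoint after each phase
            json.dump(state, f)
    return state

result = run_pipeline()
```

If the transform fails, the checkpoint still lists extract as done, so the rerun skips straight to the transform.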
Have a nice summer day,
Separating Extract, Transform and Load to three or more jobs
The DataStage Best Practices training class, for one.
Freezing in an Australian winter (32C in Darwin!)
Last edited by ray.wurlod on Tue Jul 26, 2005 1:37 am, edited 1 time in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
It is well documented in the archives of this site. Do a search for failover or recovery or banding to find threads on the subject.
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn
Hi,
Maybe the scripts that call the jobs can be split up vertically.
For example, you could run the Extract through one script and the Transform through another, and so on, so that each phase becomes a separate restart point.
Provided the traceability matrix among the jobs is kept accurate.
Regards,
Kumar
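A rough sketch of such a wrapper (the step commands here are only placeholders for whatever actually invokes each DataStage job; each step leaves a marker file so a rerun skips the steps that already finished):

```shell
#!/bin/sh
set -e
# Hypothetical driver: each phase is a separate command; a marker file
# per phase is the restart point, so a rerun skips completed phases.
run_step() {
    step="$1"; cmd="$2"
    marker=".${step}.done"
    if [ -f "$marker" ]; then
        echo "skip $step (already done)"
        return 0
    fi
    eval "$cmd"           # run the real step
    touch "$marker"       # mark success for the next restart
}
run_step extract   "echo rows > staging_extract.txt"
run_step transform "tr -d '\n' < staging_extract.txt > staging_transform.txt"
run_step load      "cp staging_transform.txt target.txt"
echo ALL_DONE
```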
If I understand the problem, it isn't a Best practice that is the problem, but your tool/method to schedule and run your jobs that is the problem.
We use our Mainframe scheduling tool to run our DataStage jobs. This tool allows us to rerun jobs. I guess I just have it a little easier than some of you.
Either way it sounds like you have one heck of a mess.
Best of luck.
Sounds like some more thought needs to go into the design of your control structures and restart points. It's all doable, and gracefully. But it must be designed with care.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
We normally advocate four phases.
- Extraction into the first staging area.
- Jobs presupposed by the Transformation phase (e.g. loading lookups).
- Transformation into the second staging area.
- Loading from the staging area into the target.
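As a sketch, the ordering constraint among the four phases (phase and staging-area names here are invented, purely illustrative):

```python
# Each phase reads from the area the previous phase wrote; the lookup
# loads must complete before the Transformation phase that uses them.
phases = [
    ("extract",      "source",    "staging1"),
    ("load_lookups", "reference", "lookups"),  # jobs the transform presupposes
    ("transform",    "staging1",  "staging2"),
    ("load",         "staging2",  "target"),
]
order = [name for name, _, _ in phases]
```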
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.