One large sequence or separate sequences?

RodBarnes
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Post by RodBarnes »

I am sure this has been asked before but I know we all enjoy sharing our opinions. :-)

We have a complete ETL sequence and now have a case where an additional dimension and fact table are needed. These new tables are related to the existing tables through shared dimensions. So...

I can think of reasons to put these new jobs in a separate sequence; e.g., modularity and minimizing cross-talk between engineers when working on the modules.

I can also think of reasons why they should end up in the same sequence; e.g., triggers between jobs, timing of the jobs not having to wait, etc.

What is the best practice here?
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

I say separate, small sequences, with only dependent jobs in any one sequence. Say you have customer snowflaked off of order: order would then need to follow customer, so I would put these in the same sequence to control the jobs.
Mamu Kim
RodBarnes
Charter Member
Posts: 182
Joined: Fri Mar 18, 2005 2:10 pm

Post by RodBarnes »

Thanks for your input. After discussing this some, we've concluded that we're going to organize our project so the shared dimensions will be built in one sequence, with individual sequences for each of the additional dimensions and fact tables. We'll probably have a single master sequence that will orchestrate everything (so we'll have the benefit of using the triggers between sequences) and it will handle the task management and logging.
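
For anyone reading along, the master-sequence idea can be sketched outside DataStage as a small dependency-ordered runner. This is only an illustration in Python, not DataStage code: the sequence names, the `run_sequence` callback, and the status strings are all hypothetical, and a real master sequence would express the same thing with Job Activity stages and triggers.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical sequence names. Each key maps to the set of sequences
# that must finish successfully before it may start.
DEPENDENCIES = {
    "SeqSharedDimensions": set(),
    "SeqExistingFacts": {"SeqSharedDimensions"},
    "SeqNewDimension": {"SeqSharedDimensions"},
    "SeqNewFact": {"SeqSharedDimensions", "SeqNewDimension"},
}


def run_master(dependencies, run_sequence):
    """Run every sequence after its prerequisites.

    run_sequence is a caller-supplied function (an assumption of this
    sketch) that takes a sequence name and returns True on success.
    Returns a dict of name -> "ok", "failed", or "skipped".
    """
    status = {}
    # static_order() yields each name only after all of its prerequisites.
    for name in TopologicalSorter(dependencies).static_order():
        prerequisites = dependencies.get(name, ())
        if any(status.get(dep) != "ok" for dep in prerequisites):
            # A prerequisite failed or was skipped, so do not run this one.
            status[name] = "skipped"
            continue
        status[name] = "ok" if run_sequence(name) else "failed"
    return status


if __name__ == "__main__":
    # Pretend the new dimension load fails; the new fact load is then skipped.
    result = run_master(DEPENDENCIES, lambda name: name != "SeqNewDimension")
    for name, state in result.items():
        print(f"{name}: {state}")
```

The point of the sketch is that the shared dimensions run first, the independent sequences only depend on them, and a failure in one branch blocks only the sequences downstream of it.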

My original question mostly came from considering rollback and error recovery, and how we could best manage them. But we've decided that the safest and most accurate approach is probably just to rely on backups.

I'd still like to hear from others who've more experience than I with ETL.