Sequence - Checkpoints - Restartability

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

nvalia
Premium Member
Posts: 180
Joined: Thu May 26, 2005 6:44 am

Post by nvalia »

Hi All,
We are designing our ETL cycle for restartability.

The main Sequence job, which controls the end-to-end ETL cycle dependencies, will be invoked by a wrapper script that first validates the state of the job, resets it if necessary, and generates a log file.
One child Sequence job will control all Staging jobs and another will control the load from Staging to the Star Schema. Additional jobs will be added as needed for pre- and post-processing.

I understand we can use checkpoints for restartability, so that after a job failure the Sequence restarts from the aborted job onward and not from the start.

What are the pros and cons of the checkpoint approach? Does it impact performance in any way, or is there anything else we should be aware of?

The other option would be a metadata-driven approach: all job names are held in a process table, and before every run we check each job's Success/Failure status (via a flag in the table, default 'N') and proceed accordingly.
But this means additional design and build of scripts, and added complexity in the process.
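
For comparison, a minimal sketch of what such a process table could look like, written in Python with SQLite purely for illustration. The table name `etl_process`, its columns, and the `run_job` callback are all hypothetical; a real implementation would call DataStage from the wrapper script and keep the table in the project's own database.

```python
import sqlite3


def init_process_table(conn, job_names):
    """Create the control table and register each job with the default flag 'N'."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS etl_process ("
        "job_name TEXT PRIMARY KEY, run_order INTEGER, success_flag TEXT DEFAULT 'N')"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO etl_process (job_name, run_order) VALUES (?, ?)",
        [(name, order) for order, name in enumerate(job_names)],
    )
    conn.commit()


def run_cycle(conn, run_job):
    """Run every job still flagged 'N', in order, stopping at the first failure.

    run_job is a hypothetical callback that returns True on success.
    Returns the names of the jobs that were actually executed in this run.
    """
    pending = conn.execute(
        "SELECT job_name FROM etl_process WHERE success_flag = 'N' ORDER BY run_order"
    ).fetchall()
    executed = []
    for (job_name,) in pending:
        executed.append(job_name)
        if not run_job(job_name):
            break  # flag stays 'N', so the next run resumes at this job
        conn.execute(
            "UPDATE etl_process SET success_flag = 'Y' WHERE job_name = ?",
            (job_name,),
        )
        conn.commit()
    return executed


def reset_cycle(conn):
    """Set every flag back to 'N' once a full cycle has completed."""
    conn.execute("UPDATE etl_process SET success_flag = 'N'")
    conn.commit()
```

Even in this reduced form, the flag handling, the ordering, and the end-of-cycle reset all have to be written and maintained by hand, which is the extra complexity mentioned above.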

Has anyone implemented this approach? If so, could you share your experience?

Thanks,
NV
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

There are no "performance impacts" for using checkpoints in a sequence job. And I can't imagine the need to re-invent the wheel or the additional complexity that would add when almost literally all you have to do is check a box.

Everything will automatically get a checkpoint, but you do have the option to say "Do not checkpoint" for any task that must always run regardless of the overall status. In case of errors in the sequence job, make sure it aborts rather than just stops. That's what "activates" the checkpoints, in a manner of speaking, so that your final job status is "Aborted Restartable" rather than simply "Aborted". You can then either reset the job to start over from the beginning or simply run it, and it will restart at the failure point as you noted.
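
To make that behavior concrete, here is a toy Python model of a checkpointed sequence. It is not DataStage code and says nothing about how the engine stores checkpoints; the class name, the tuple layout, and the in-memory `completed` set are assumptions made only for the sketch.

```python
class CheckpointedSequence:
    """Toy model of a checkpointed sequence job; not DataStage code."""

    def __init__(self, activities):
        # activities: list of (name, action, checkpointed) tuples, where
        # action is a callable returning True on success and checkpointed
        # is False for tasks marked "Do not checkpoint".
        self.activities = activities
        self.completed = set()  # checkpoints recorded so far
        self.status = "Not run"

    def run(self):
        """Run the sequence, skipping activities that already have a checkpoint.

        Returns the names of the activities executed in this run.
        """
        executed = []
        for name, action, checkpointed in self.activities:
            if checkpointed and name in self.completed:
                continue  # finished in an earlier run, so skipped on restart
            executed.append(name)
            if not action():
                self.status = "Aborted Restartable"
                return executed
            if checkpointed:
                self.completed.add(name)
        self.completed.clear()  # clean finish: the next run starts from the top
        self.status = "Finished"
        return executed

    def reset(self):
        """Discard checkpoints so the next run starts from the beginning."""
        self.completed.clear()
        self.status = "Reset"
```

In this model, a sequence of `notify` (not checkpointed), `stage`, and `load` that fails in `load` will, on the next run, execute `notify` and `load` only; calling `reset()` first makes it run all three again.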
-craig

"You can never have too many knives" -- Logan Nine Fingers
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Our design adds just one thing to Craig's important call for simplicity: every checkpoint sequence stage has a terminator activity on the abort condition link as the "exit" point. It lets you add a final message text but more importantly lets you control other processes that may be running.

About restartability: you mention child sequences. I don't think that's a good design decision myself, because it adds an extra layer to everything and makes the Director output more complex. But the main point is that the checkpoint automatically resets the aborted parallel job under its job activity, making your wrapper script's query of job statuses redundant. The checkpoint automatically sees the abort status and issues a reset prior to restarting the job.

Edit: we use looping in our job sequences. A checkpoint in the loop also restarts on the aborted iteration of the loop. We find this to be very beneficial.
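
The loop behavior can be pictured with a small Python sketch. This is only an analogy for restarting on the aborted iteration, not how DataStage implements it; `run_loop`, the `process` callback, and the `checkpoint` dict are hypothetical names.

```python
def run_loop(items, process, checkpoint):
    """Process items in order, resuming at the iteration saved in checkpoint.

    checkpoint is a dict that survives between runs; its "next" entry is the
    index of the first iteration that has not completed yet.
    Returns True if the loop finished, False if an iteration aborted.
    """
    start = checkpoint.get("next", 0)
    for index in range(start, len(items)):
        if not process(items[index]):
            checkpoint["next"] = index  # restart on the aborted iteration
            return False
        checkpoint["next"] = index + 1
    checkpoint.clear()  # clean finish: the next run starts at iteration 0
    return True
```

If the third of four iterations fails, the saved index points at that iteration, so the rerun repeats it and then continues with the fourth without redoing the first two.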
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
nvalia
Premium Member
Posts: 180
Joined: Thu May 26, 2005 6:44 am

Post by nvalia »

Thank you Chulett and Franklin for the detailed responses.

Franklin, I was thinking of child Sequences only in the context of the non-checkpoint approach. But based on the comments above, I will definitely go with the checkpoint approach.
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

You're welcome. The best thing about DataStage is that it can do so many things. The worst thing about DataStage is that it lets you do too many things. 8)
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Make sure your child sequences abort as well when there's a problem. That will be communicated upstream so your main sequence can be aborted as well. Then a restart will find its way back down the rabbit hole to where it needs to restart no matter how deep it needs to go. :wink:
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

An alternative to Craig's most recent suggestion is to make sure that your sub-sequences throw a warning (but do not need to abort). This, too, can be detected upstream (set parent sequences to log a warning if any activity does not finish with a status of OK).

If you really want an esoteric solution, the sub-sequence can (through a Routine activity) log a warning in its controller's log.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I was assuming the child / sub-sequences were checkpointed as well...