Checkpoint restart
As you may know, there is a feature called "Checkpoint restart" for DS Job Sequences; however, it seems to support only one level.
My case is as follows.
Top level, Seq000: contains two sequences
Seq001
Seq002
Second level:
Seq001 contains three jobs
Job001
Job002
Job003
Seq002 contains three jobs
Job004
Job005
Job006
If Seq001 ends normally and Job005 aborts, then Seq002 and Seq000 abort. According to the checkpoint restart feature, the rerun should restart from Job005; however, it can only restart from Seq002 ---> Job004 (not Job005).
Is this a well-known case? Does anyone know whether two or more levels will be supported in a later release?
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Hi,
Do you see the jobs getting reset before running again?
When they are in an aborted state, do you see only the Reset option in Director for those sequence jobs?
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
![Image](http://www.worldcommunitygrid.org/images/logo.gif)
OK, and does the log show that they have been reset when you rerun the top-level sequence job?
Roy R.
Oh, yes, I see what is happening: the sub-level sequence has been reset, so it starts from the beginning.
Does that mean that for a sub-level sequence we should set the option "Run" instead of "Reset if required, then run"?
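The hypothesis in this post can be illustrated with a toy model. This is only a sketch in Python, not DataStage code: the `Job` and `Sequence` classes, the `checkpoints` set, and the `reset_children` flag are hypothetical names standing in for jobs, job sequences, their checkpoint records, and the "Reset if required, then run" setting on the parent's Job Activities. It assumes that resetting an aborted sequence erases its checkpoints, which is the poster's reading of the log rather than confirmed product behaviour.

```python
class JobAborted(Exception):
    """Raised when a job fails; propagates up through enclosing sequences."""


class Job:
    def __init__(self, name, failing):
        self.name = name
        self.failing = failing  # shared set of job names that abort when run

    def reset(self):
        pass  # a plain job keeps no checkpoints in this model

    def run(self, log, reset_children=False):
        log.append(self.name)
        if self.name in self.failing:
            raise JobAborted(self.name)


class Sequence:
    def __init__(self, name, activities):
        self.name = name
        self.activities = activities
        self.checkpoints = set()  # names of activities that completed
        self.aborted = False

    def reset(self):
        # Assumption: resetting an aborted sequence discards its checkpoints.
        if self.aborted:
            self.checkpoints.clear()
            self.aborted = False

    def run(self, log, reset_children=False):
        self.aborted = False
        for activity in self.activities:
            if activity.name in self.checkpoints:
                continue  # completed on an earlier run, so skip it
            if reset_children:
                activity.reset()  # models "Reset if required, then run"
            try:
                activity.run(log, reset_children)
            except JobAborted:
                self.aborted = True
                raise  # let the abort bubble up to the parent
            self.checkpoints.add(activity.name)


def demo(reset_children):
    """Abort at Job005, then rerun Seq000 and return the jobs that ran again."""
    failing = {"Job005"}
    seq001 = Sequence("Seq001", [Job(n, failing) for n in ("Job001", "Job002", "Job003")])
    seq002 = Sequence("Seq002", [Job(n, failing) for n in ("Job004", "Job005", "Job006")])
    top = Sequence("Seq000", [seq001, seq002])
    try:
        top.run([], reset_children)
    except JobAborted:
        pass
    failing.clear()  # pretend the cause of the Job005 abort has been fixed
    rerun_log = []
    top.run(rerun_log, reset_children)
    return rerun_log


print(demo(reset_children=True))   # ['Job004', 'Job005', 'Job006']
print(demo(reset_children=False))  # ['Job005', 'Job006']
```

In this model the top-level checkpoint for Seq001 survives in both cases; only the reset of Seq002 decides whether the rerun starts at Job004 or at Job005.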
However, I have encountered another case.
Options:
Add checkpoints so sequence is restartable on failure (ON)
Automatically handle activities that fail (ON)
In my previous case, I used an "Exception Handler" and a "Routine Activity" (calling DSJobAbort) to force the sequence to abort if any job aborted. In that case, the status of the sub-level sequence is Aborted/Restartable, so no matter which option I set in the top-level sequence ("Run" or "Reset if required, then run"), rerunning the top level resets the aborted sub-level.
If I don't use the "Exception Handler" to force the abort, the sub-level's status is Finished/Restartable (it stopped at the aborted job). However, the top level does not detect the failure in the sub-level and continues with the next Job Activity in the top level.
That is the problem. If the activities have no dependencies, that is fine, but in most cases they do.
Now suppose the top level finished normally (assuming there are no dependencies). When I rerun the top level, it starts from the beginning. When it reaches the failed sub-level, the sub-level is not reset (because its previous status is Finished/Restartable) and starts from the aborted job. But that isn't what I want, because the top level starts from the beginning.
I would be grateful if anyone could shed some light on Checkpoint Restart usage.
Many thanks!
Regards,
Benny
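The two situations described in the post above can be summarised as a small lookup table. The sketch below only restates the poster's observations in Python; the `Observed` record, its field names, and the `summarize` helper are hypothetical and do not correspond to any DataStage API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observed:
    parent_detects_failure: bool  # does the top level abort as well?
    sub_reset_on_rerun: bool      # is the sub-level reset when the top level reruns?
    sub_restarts_at: str
    top_restarts_at: str


# Keys are the sub-level statuses named in the post. Values restate what the
# poster reported, not documented product behaviour.
OBSERVATIONS = {
    "Aborted/Restartable": Observed(True, True, "first job", "failed sub-level"),
    "Finished/Restartable": Observed(False, False, "aborted job", "beginning"),
}


def summarize(status):
    """Return a one-line description of the reported behaviour for a status."""
    o = OBSERVATIONS[status]
    reaction = "aborts" if o.parent_detects_failure else "continues"
    return (
        f"{status}: top level {reaction}; "
        f"on rerun the top level starts at the {o.top_restarts_at} "
        f"and the sub-level starts at its {o.sub_restarts_at}"
    )


for status in OBSERVATIONS:
    print(summarize(status))
```

Neither row gives the combination the poster wants, which is a top level that restarts at the failed sub-level together with a sub-level that restarts at the aborted job.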
I am a DS beginner and have also encountered the case described above. Can anyone help?
Thanks in advance!
-
- Charter Member
- Posts: 143
- Joined: Thu Nov 04, 2004 6:53 am
When you use checkpoint restart, the sequences should be designed so that any abort at the bottom level bubbles all the way up. Those options handle that well. Another catch: when you make your jobs restartable and set "Reset if required, then run", the restart will not work. With "Reset if required, then run", even the checkpoint is erased when there is an abort.
So, use the restartability option together with "Run".
Quote: When you go for "Reset if required, then run", even the checkpoint will be erased when there is an abort.
This is not true. We do this all the time. When you restart a sequence, it should run only the jobs that have not run or that aborted. This works fine. We use it in all our sequences, and all our jobs are set to "Reset if required, then run".
Mamu Kim
kduke,
Actually, what I encountered is that the checkpoint was erased. I have been puzzled by this for a long time.