Job Sequencer status Finished
-
- Participant
- Posts: 38
- Joined: Sun Mar 25, 2007 11:05 pm
- Location: chennai
I have two sequencers, Child sequencer (sq_Child05) and Parent sequencer (sq_Parent).
Parent sequencer
|
|
Child sequencer
| |
| |
jb_16, jb_25
The child sequencer calls two jobs, jb_16 and jb_25. In the child sequencer I am using an Exception Handler stage that leads to a Notification Activity stage. I enabled "Add checkpoints so sequence is restartable on failure" and "Automatically handle activities that fail", and I disabled "Do not checkpoint run".
The parent sequencer calls the child sequencer and has no Exception Handler stage. I enabled "Add checkpoints so sequence is restartable on failure" and "Automatically handle activities that fail", and I disabled "Do not checkpoint run".
Problem:
In the first run, job jb_16 aborted, and consequently the child and parent sequencers also aborted. When I restarted the parent sequencer (which was in Aborted/Restartable state), job jb_16 aborted again, but the child and parent sequencers ended with status Finished.
Question:
Why do the child and parent sequencers finish (during the second, restarted run) even though the job aborts?
How can I make the child and parent sequencers end in Aborted/Restartable state?
First run:
===========
Starting Job sq_Child05.
Environment variable settings:(...)
sq_Child05..JobControl (@Coordinator): Starting new run of checkpointed Sequence job
sq_Child05..JobControl (@ec_CreateConfig_05): Executed: $APT_GRID_HOME/sequencer.sh(...)
sq_Child05..JobControl (@ec_CreateConfig_05): Omitted checkpoint for execution of command '$APT_GRID_HOME/sequencer.sh'
sq_Child05 -> (jb_25): Job run requested(...)
sq_Child05..JobControl (DSRunJob): Waiting for job jb_25 to start
sq_Child05 -> (jb_16): Job run requested(...)
sq_Child05..JobControl (DSRunJob): Waiting for job jb_16 to start
sq_Child05..JobControl (DSWaitForJob): Job jb_16 has finished, status = 3 (Aborted)
sq_Child05..JobControl (@JA_RT16): Job jb_16 did not finish OK, status = 'Aborted'
sq_Child05..JobControl (@JA_RT16): Report on job: jb_16(...)
sq_Child05..JobControl (@JA_RT16): Controller problem: Unhandled abort encountered in job jb_16
sq_Child05..JobControl (@JA_RT16): Will execute error activity: EH_Verify_Error
sq_Child05..JobControl (DSSendMail): Sent message to 'abc@xyz.com'
sq_Child05..JobControl (DSWaitForJob): Waiting for job jb_25 to finish
sq_Child05..JobControl (DSWaitForJob): Job jb_25 has finished, status = 1 (Finished OK)
sq_Child05..JobControl (@JA_RT25): Report on job: jb_25(...)
sq_Child05..JobControl (@JA_RT25): Checkpointed run of job 'jb_25'
sq_Child05..JobControl (@Coordinator): Summary of sequence run(...)
sq_Child05..JobControl (fatal error from @Coordinator): Sequence job (restartable) will abort due to previous unrecoverable errors
Attempting to Cleanup after ABORT raised in stage sq_Child05..JobControl
(sq_Parent) <- sq_Child05: Job under control finished.
Second run:
============
Starting Job sq_Child05.(...)
Environment variable settings:(...)
sq_Child05..JobControl (@Coordinator): Sequence job is being restarted after failure(...)
sq_Child05..JobControl (@ec_CreateConfig_05): Executed: $APT_GRID_HOME/sequencer.sh(...)
sq_Child05..JobControl (@ec_CreateConfig_05): Omitted checkpoint for execution of command '$APT_GRID_HOME/sequencer.sh'
sq_Child05..JobControl (@JA_RT25): Skipped run of job 'jb_25' on restart
sq_Child05..JobControl (DSPrepareJob): Attempting to reset failed job jb_16
sq_Child05 -> (jb_16): Job reset requested
sq_Child05..JobControl (DSRunJob): Waiting for job jb_16 to start
sq_Child05..JobControl (DSWaitForJob): Waiting for job jb_16 to finish
sq_Child05..JobControl (DSWaitForJob): Job jb_16 has finished, status = 21 (Has been reset)
sq_Child05 -> (jb_16): Job run requested(...)
sq_Child05..JobControl (DSRunJob): Waiting for job jb_16 to start
sq_Child05..JobControl (DSWaitForJob): Waiting for job jb_16 to finish
sq_Child05..JobControl (DSWaitForJob): Job jb_16 has finished, status = 3 (Aborted)
sq_Child05..JobControl (@JA_RT16): Job jb_16 did not finish OK, status = 'Aborted'
sq_Child05..JobControl (@JA_RT16): Report on job: jb_16(...)
sq_Child05..JobControl (@JA_RT16): Controller problem: Unhandled abort encountered in job jb_16
sq_Child05..JobControl (@JA_RT16): Controller problem: Unhandled abort encountered in job jb_16
sq_Child05..JobControl (@JA_RT16): Will execute error activity: EH_Verify_Error
sq_Child05..JobControl (DSSendMail): Sent message to 'abc@xyz.com'(...)
sq_Child05..JobControl (@Coordinator): Summary of sequence run(...)
Finished Job sq_Child05.
(sq_Parent) <- sq_Child05: Job under control finished.
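The difference between the two runs can be checked mechanically. The following is a minimal Python sketch, assuming the log has been exported as plain text in the format shown above; the function names `last_job_statuses` and `sequence_aborted` are hypothetical helpers, not part of DataStage. It flags jobs whose last reported status is 3 (Aborted) in a log that contains no abort entry for the sequence.

```python
import re

# Matches lines such as:
#   sq_Child05..JobControl (DSWaitForJob): Job jb_16 has finished, status = 3 (Aborted)
FINISH_RE = re.compile(r"Job (\S+) has finished, status = (\d+) \(([^)]*)\)")


def last_job_statuses(log_text):
    """Return {job_name: (status_code, status_text)} from the last finish line per job."""
    statuses = {}
    for line in log_text.splitlines():
        match = FINISH_RE.search(line)
        if match:
            job, code, text = match.groups()
            statuses[job] = (int(code), text)
    return statuses


def sequence_aborted(log_text):
    """True if the log contains the abort cleanup entry seen in the first run."""
    return "Attempting to Cleanup after ABORT raised" in log_text


# Excerpt of the second run, copied from the log above.
second_run = """\
sq_Child05..JobControl (DSWaitForJob): Job jb_16 has finished, status = 21 (Has been reset)
sq_Child05..JobControl (DSWaitForJob): Job jb_16 has finished, status = 3 (Aborted)
Finished Job sq_Child05.
"""

statuses = last_job_statuses(second_run)
# Jobs that aborted although the sequence itself logged no abort.
mismatch = [
    job for job, (code, _) in statuses.items()
    if code == 3 and not sequence_aborted(second_run)
]
print(statuses)
print(mismatch)
```

Run against the first-run log, the same check should report no mismatch, because that log contains the abort cleanup entry.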
thanks & Regards
A.S.Porkalai Lakshmi
-
- Participant
- Posts: 25
- Joined: Fri Jan 11, 2008 12:49 am
- Location: Pune, India
-
- Participant
- Posts: 38
- Joined: Sun Mar 25, 2007 11:05 pm
- Location: chennai
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Go to the job sequence's job properties, select "Automatically handle activities that fail" and re-compile. Without this set, jobs under control can abort and the sequence finish with an OK status - you have to inspect the job sequence's log to determine the cause. With it checked, and without explicit error handling in the job sequence itself, the job sequence will abort if any of its activities reports a failure.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
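The rule described in the reply above can be written out as a small decision function. This is only a toy model of that description in Python, not DataStage's actual logic; the function name and parameter names are hypothetical. The case where the option is checked and the sequence also has explicit error handling is not covered by the reply, so the sketch returns "Unspecified" for it rather than guessing.

```python
def sequence_outcome(auto_handle_failures, has_explicit_error_handling, activity_failed):
    """Toy model of the rule as described in the reply; not DataStage's real behaviour."""
    if not activity_failed:
        return "Finished"
    if not auto_handle_failures:
        # Described case: the sequence can finish OK; the log must be inspected.
        return "Finished (inspect log)"
    if not has_explicit_error_handling:
        # Described case: the sequence aborts if any activity reports a failure.
        return "Aborted"
    # Option checked AND explicit error handling present: the reply does not say.
    return "Unspecified"


print(sequence_outcome(True, False, True))
print(sequence_outcome(False, False, True))
print(sequence_outcome(True, True, True))
```

The third call corresponds to the child sequencer in this thread, which has the option set and also contains an Exception Handler stage.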
-
- Participant
- Posts: 25
- Joined: Fri Jan 11, 2008 12:49 am
- Location: Pune, India
-
- Participant
- Posts: 38
- Joined: Sun Mar 25, 2007 11:05 pm
- Location: chennai
Hi Ray,
The option "Automatically handle activities that fail" has been set in both the parent and child sequences. Still, the parent sequencer ends with status Finished (i.e., during the second run). I want the parent sequencer to be in Aborted/Restartable state whenever the jobs abort.
thanks & Regards
A.S.Porkalai Lakshmi
-
- Participant
- Posts: 38
- Joined: Sun Mar 25, 2007 11:05 pm
- Location: chennai
-
- Participant
- Posts: 597
- Joined: Fri Apr 29, 2005 6:19 am
- Location: Singapore
-
- Participant
- Posts: 38
- Joined: Sun Mar 25, 2007 11:05 pm
- Location: chennai
Hi,
No triggers are defined on either job activity, and everything runs in parallel. The Exception Handler stage is independent; it is not connected to jb_16 or jb_25.
As you can see in the first run's job log, the child sequence raised an ABORT ("Attempting to Cleanup after ABORT raised in stage sq_Child05..JobControl"), but the second run's job log has no abort entry.
I want to know why the child sequence did not abort in the second run.
Is there any way to abort the child sequence whenever a job aborts?
thanks & Regards
A.S.Porkalai Lakshmi
-
- Participant
- Posts: 597
- Joined: Fri Apr 29, 2005 6:19 am
- Location: Singapore
Just do a test: keep them running in parallel, but add a dummy activity after the two jobs, with triggers on the output links of both jobs. For your scenario, you might need some "sequencing" so that the selected options take effect.
Kandy
_________________
Try and Try again…You will succeed atlast!!
-
- Participant
- Posts: 25
- Joined: Fri Jan 11, 2008 12:49 am
- Location: Pune, India
Hi
Not sure if you found a solution to your problem, but here's something that might be useful:
(This is from the DS Director manual)
If, during sequence execution, the flow diverts to an error handling
stage, DataStage does not checkpoint anything more. This is to
ensure that stages in the error handling path will not be skipped if
the job is restarted and another error is encountered.
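The quoted rule can be illustrated with a toy simulation. This is a minimal Python sketch assuming a simplified model in which activities run one after another and every successful activity is checkpointed until the first failure; it is not DataStage's implementation, and `run_sequence`, `job_a` and `job_b` are hypothetical names.

```python
def run_sequence(activities, results, checkpoints):
    """Run activities in order and return the names that were actually executed.

    results maps activity name -> True (succeeded) or False (failed).
    checkpoints is a set that persists between runs and is updated in place.
    """
    executed = []
    diverted = False  # becomes True once the flow goes to error handling
    for name in activities:
        if name in checkpoints:
            continue  # checkpointed in an earlier run, so skipped on restart
        executed.append(name)
        if not results[name]:
            diverted = True
        elif not diverted:
            checkpoints.add(name)  # nothing is checkpointed after a divert
    return executed


# Case 1: job_a succeeds before job_b fails, so job_a is checkpointed.
cp = set()
first = run_sequence(["job_a", "job_b"], {"job_a": True, "job_b": False}, cp)
restart = run_sequence(["job_a", "job_b"], {"job_a": True, "job_b": False}, cp)
print(first, restart, cp)

# Case 2: job_a fails first, so job_b's later success is not checkpointed.
cp2 = set()
first2 = run_sequence(["job_a", "job_b"], {"job_a": False, "job_b": True}, cp2)
restart2 = run_sequence(["job_a", "job_b"], {"job_a": False, "job_b": True}, cp2)
print(first2, restart2, cp2)
```

In the first case the restart runs only the failed activity; in the second, the restart runs both activities again because nothing was checkpointed after the flow diverted.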