Job Sequencer status Finished
Posted: Wed Dec 24, 2008 1:59 am
I have two sequencers: a child sequencer (sq_Child05) and a parent sequencer (sq_Parent).
Parent sequencer
|
|
Child sequencer
| |
| |
jb_16, jb_25
The child sequencer calls two jobs, jb_16 and jb_25. In the child sequencer I'm using an Exception Handler stage that links to a Notification Activity stage. I enabled "Add checkpoints so sequence is restartable on failure" and "Automatically handle activities that fail", and I disabled "Do not checkpoint run".
The parent sequencer calls the child sequencer and has no Exception Handler stage. I enabled "Add checkpoints so sequence is restartable on failure" and "Automatically handle activities that fail", and I disabled "Do not checkpoint run".
Problem:
In the first run, job jb_16 aborted and, as a consequence, the child and parent sequencers also aborted. When I then restarted the parent sequencer (which was in Aborted/Restartable state), jb_16 aborted again, but this time the child and parent sequencers both ended with status Finished.
Question:
Why do the child and parent sequencers end as Finished on the second (restarted) run even though the job aborts?
How can I make the child and parent sequencers end in Aborted/Restartable state in that case?
First run:
===========
Starting Job sq_Child05.
Environment variable settings:(...)
sq_Child05..JobControl (@Coordinator): Starting new run of checkpointed Sequence job
sq_Child05..JobControl (@ec_CreateConfig_05): Executed: $APT_GRID_HOME/sequencer.sh(...)
sq_Child05..JobControl (@ec_CreateConfig_05): Omitted checkpoint for execution of command '$APT_GRID_HOME/sequencer.sh'
sq_Child05 -> (jb_25): Job run requested(...)
sq_Child05..JobControl (DSRunJob): Waiting for job jb_25 to start
sq_Child05 -> (jb_16): Job run requested(...)
sq_Child05..JobControl (DSRunJob): Waiting for job jb_16 to start
sq_Child05..JobControl (DSWaitForJob): Job jb_16 has finished, status = 3 (Aborted)
sq_Child05..JobControl (@JA_RT16): Job jb_16 did not finish OK, status = 'Aborted'
sq_Child05..JobControl (@JA_RT16): Report on job: jb_16(...)
sq_Child05..JobControl (@JA_RT16): Controller problem: Unhandled abort encountered in job jb_16
sq_Child05..JobControl (@JA_RT16): Will execute error activity: EH_Verify_Error
sq_Child05..JobControl (DSSendMail): Sent message to 'abc@xyz.com'
sq_Child05..JobControl (DSWaitForJob): Waiting for job jb_25 to finish
sq_Child05..JobControl (DSWaitForJob): Job jb_25 has finished, status = 1 (Finished OK)
sq_Child05..JobControl (@JA_RT25): Report on job: jb_25(...)
sq_Child05..JobControl (@JA_RT25): Checkpointed run of job 'jb_25'
sq_Child05..JobControl (@Coordinator): Summary of sequence run(...)
sq_Child05..JobControl (fatal error from @Coordinator): Sequence job (restartable) will abort due to previous unrecoverable errors
Attempting to Cleanup after ABORT raised in stage sq_Child05..JobControl
(sq_Parent) <- sq_Child05: Job under control finished.
Second run:
============
Starting Job sq_Child05.(...)
Environment variable settings:(...)
sq_Child05..JobControl (@Coordinator): Sequence job is being restarted after failure(...)
sq_Child05..JobControl (@ec_CreateConfig_05): Executed: $APT_GRID_HOME/sequencer.sh(...)
sq_Child05..JobControl (@ec_CreateConfig_05): Omitted checkpoint for execution of command '$APT_GRID_HOME/sequencer.sh'
sq_Child05..JobControl (@JA_RT25): Skipped run of job 'jb_25' on restart
sq_Child05..JobControl (DSPrepareJob): Attempting to reset failed job jb_16
sq_Child05 -> (jb_16): Job reset requested
sq_Child05..JobControl (DSRunJob): Waiting for job jb_16 to start
sq_Child05..JobControl (DSWaitForJob): Waiting for job jb_16 to finish
sq_Child05..JobControl (DSWaitForJob): Job jb_16 has finished, status = 21 (Has been reset)
sq_Child05 -> (jb_16): Job run requested(...)
sq_Child05..JobControl (DSRunJob): Waiting for job jb_16 to start
sq_Child05..JobControl (DSWaitForJob): Waiting for job jb_16 to finish
sq_Child05..JobControl (DSWaitForJob): Job jb_16 has finished, status = 3 (Aborted)
sq_Child05..JobControl (@JA_RT16): Job jb_16 did not finish OK, status = 'Aborted'
sq_Child05..JobControl (@JA_RT16): Report on job: jb_16(...)
sq_Child05..JobControl (@JA_RT16): Controller problem: Unhandled abort encountered in job jb_16
sq_Child05..JobControl (@JA_RT16): Controller problem: Unhandled abort encountered in job jb_16
sq_Child05..JobControl (@JA_RT16): Will execute error activity: EH_Verify_Error
sq_Child05..JobControl (DSSendMail): Sent message to 'abc@xyz.com'(...)
sq_Child05..JobControl (@Coordinator): Summary of sequence run(...)
Finished Job sq_Child05.
(sq_Parent) <- sq_Child05: Job under control finished.
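To make the mismatch between the two runs easier to spot, here is a minimal Python sketch that scans a pasted log for the "has finished, status = N" lines and for the coordinator's abort message. It only parses the log text as quoted above and does not call any DataStage API; the function names (last_job_statuses, sequence_aborted) are hypothetical helpers made up for illustration, and the status codes (3 = Aborted, 21 = Has been reset) are taken from the log lines themselves.

```python
import re

# Matches log lines such as:
#   "... (DSWaitForJob): Job jb_16 has finished, status = 3 (Aborted)"
STATUS_RE = re.compile(r"Job (\S+) has finished, status = (\d+) \(([^)]*)\)")


def last_job_statuses(log_text):
    """Return {job_name: (status_code, status_text)}, keeping the last status seen per job."""
    statuses = {}
    for line in log_text.splitlines():
        match = STATUS_RE.search(line)
        if match:
            job, code, text = match.groups()
            statuses[job] = (int(code), text)
    return statuses


def sequence_aborted(log_text):
    """True if the log contains the coordinator's abort message quoted in the first run."""
    return "will abort due to previous unrecoverable errors" in log_text


# Excerpt of the second-run log from this post.
second_run = """\
sq_Child05..JobControl (DSWaitForJob): Job jb_16 has finished, status = 21 (Has been reset)
sq_Child05..JobControl (DSWaitForJob): Job jb_16 has finished, status = 3 (Aborted)
sq_Child05..JobControl (@Coordinator): Summary of sequence run(...)
Finished Job sq_Child05.
"""

statuses = last_job_statuses(second_run)
# Jobs that ended Aborted (status 3) although the sequence itself did not abort.
mismatch = [
    job
    for job, (code, _) in statuses.items()
    if code == 3 and not sequence_aborted(second_run)
]
print(statuses)
print(mismatch)
```

Run against the first-run log instead, sequence_aborted would return True because that log contains the "will abort due to previous unrecoverable errors" line, so the mismatch list would be empty.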