Job control process (pid 1084) has failed
Posted: Mon Nov 03, 2008 10:04 am
Hi,
I am having a very frustrating problem:
We have a sequence job that runs various server jobs and other sequence jobs. This job is set up to give an email notification if it fails.
Sometimes the job neither fails nor completes.
For example
mainSequencejob runs various server jobs, then runs sequencejob1, which runs serverjob1 and serverjob2. Then mainSequencejob moves on to other server and sequence jobs.
Both serverjob1 and serverjob2 complete successfully and sequencejob1 completes successfully, but mainSequencejob has a warning: "Job control process (pid 1084) has failed"
The frustrating part is that this happens sometime during the night, but no email notification is sent. As soon as someone logs into DataStage Director and goes to view the logs, the warning appears and the email is sent. Often the job will then continue, so I get a sequence job with a status of "aborted" that is still running. It is almost as if the whole ETL job is in limbo until someone logs in.
We also can't reproduce it. It will happen one day and not the next.
I'm taking a wild guess that our AS/400 is dropping the process and DataStage is not being informed. Since company policy does not allow me to look at the production server, and the sys-admins say "no, nothing happened last night", I can only guess.
I don't understand why DataStage seems to sit and wait if the process ID has been dropped.
Do you think this is a DataStage issue or an AS/400 issue?
Stats:
The Source DB is AS/400 DB2
Target DW is SQL Server 2005
We connect using ODBC
DataStage version is 7.5.1