Hi,
One of my jobs is failing with the following error:
Parallel job reports failure (code 139)
Surprisingly, the same job ran fine in the dev environment but fails in the QA environment. If I create a copy of the job and run the copy, it runs fine.
We faced the same issue earlier with another job, and there too we created a copy and ran the copy instead. I think the job log sometimes gets locked, which prevents us from running the job. Because the log is locked, we can't even rename or delete the job.
The problem is that we can't keep creating copies of jobs every time the error comes up. Is anyone aware of a solution?
Job failing due to (code 139)
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 38
- Joined: Tue Jun 19, 2012 11:03 pm
- Location: India
The project might be corrupt. You should probably run ASBNode/bin/SyncProject.sh with the -Report option to check. You can only run that tool when the project is not in use at all! That means no clients active and no jobs active.
If it reports corruption, you have the option to fix it with the -Fix option. Be warned, though: sometimes it 'fixes' stuff by removing it! So back up all the jobs before "fixing" them.
As for what might be causing the problems, you might have a lot of "zombies" in the background that are keeping jobs open and locked. When a developer loses the connection to DataStage (the computer goes into hibernation, or the wireless gets disconnected), the background processes for that session keep running on the server and keep the job "locked". Since they are alive but basically dead, I call them zombies! Setting a value for the inactivity timeout in the Administrator client makes those zombies go away once they have been inactive for that length of time, which releases their locks. Just don't set it too short, or it may kill running jobs that are waiting on large database selects, etc.!
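A side note on the number in the original error: if the reported code follows the usual Unix shell convention (an assumption here, since the thread does not confirm how the engine derives it), a status above 128 means the process was killed by signal number `code - 128`. For 139 that is signal 11, which is SIGSEGV on Linux. The helper name `describe_exit_code` below is made up for illustration; this is only a minimal sketch for decoding such a code, not a diagnosis of this job.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Decode an exit status, assuming the shell convention 128 + signal number."""
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name for this signal number on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            return f"exit code {code}: 128 + {signum} (no known signal with that number)"
        return f"exit code {code}: terminated by {name} (signal {signum})"
    return f"exit code {code}: ordinary exit status, not a signal"


if __name__ == "__main__":
    print(describe_exit_code(139))
```

If the code really does mean a segmentation fault, that points to a process crashing rather than a failed lock by itself, so it would be worth checking both.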