
Job failing due to (code 139)

Posted: Mon Apr 11, 2016 10:19 am
by sharmabhavesh
Hi,
One of my jobs is failing with the following error:
Parallel job reports failure (code 139)

Surprisingly, the same job ran fine in the dev environment but is failing in the QA environment. If I create a copy of the job and run the copy, it runs fine.

We faced the same issue earlier with another job; we created a copy of it and ran the copy. I think the job log sometimes gets locked, and hence we can't run that job. Since the log is locked, we can't even rename or delete the job.

The problem is that we can't keep creating copies of jobs if the error keeps recurring. Is anyone aware of a solution?
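
As background, and assuming the code reported here follows the usual Unix convention of 128 plus a signal number, 139 would correspond to signal 11 (a segmentation fault) in one of the job's processes. A minimal Python sketch of that decoding; the function name is made up for illustration and is not part of DataStage:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret an exit status using the common shell convention 128 + signal."""
    if code > 128:
        signum = code - 128
        try:
            # Map the signal number to its symbolic name on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"terminated by signal {signum} ({name})"
    return f"exited normally with status {code}"


print(describe_exit_code(139))
```

This only explains what the number may mean; it does not say why the process crashed in QA and not in dev.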

Posted: Mon Apr 11, 2016 10:37 am
by chulett
Did you do an exact search here for "Parallel job reports failure (code 139)"? There are 40 posts (not counting yours) that mention it; hopefully something in there is useful.

Posted: Mon Apr 11, 2016 10:42 am
by sharmabhavesh
Yes Craig,
One of them matched my issue, and that person faced the same problem. The workaround was to create a copy of the job. But since I can't create a copy every time I encounter such an error, I am looking for a proper solution, if anyone can suggest one.

Posted: Mon Apr 11, 2016 1:12 pm
by asorrell
The project might be corrupt. You should probably run ASBNode/bin/SyncProject.sh (with the -Report option) to check. You can only run that tool if the project is not in use at all! This means no clients active, no jobs active.

If it reports corruption then you have the option to fix it (by using the -Fix option). However, be warned - sometimes it 'fixes' stuff by removing it! So back up all the jobs prior to "fixing" them.

In terms of what might be causing the problems - you might be seeing a lot of "zombies" in the background that are keeping jobs open and locked. When a developer loses his connection to DataStage (computer goes into hibernation, or the wireless gets disconnected) it will leave background processes running that will have the job "locked". Since they are alive but basically dead - I call them zombies! Setting a value for the inactivity timeout in the admin client will cause those zombies to go away once they are inactive for that length of time. That will cause them to release their locks. Just don't set it too short or it may kill running jobs that are waiting for large database selects, etc.!
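
To get a rough idea of whether such leftover client processes exist before changing the timeout, something like the following Python sketch could be run on the engine host. It is only a sketch under assumptions: the process names `dsapi_slave` and `dscs` are assumed names for client connection processes and may differ on your install, the `ps -eo pid,etimes,args` syntax is the Linux procps form, and the function names are made up for illustration. It measures how long a process has existed, not how long it has been inactive, and it only reports; it never kills anything.

```python
import subprocess

# Assumption: client connections show up as processes with these names.
CANDIDATE_NAMES = ("dsapi_slave", "dscs")


def find_stale(ps_output: str, min_age_seconds: int, names=CANDIDATE_NAMES):
    """Return (pid, age_seconds, command) for matching processes older than the threshold."""
    stale = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # pid, elapsed seconds, rest of command line
        if len(parts) < 3:
            continue
        pid, etimes, command = parts
        if not (pid.isdigit() and etimes.isdigit()):
            continue
        if any(name in command for name in names) and int(etimes) >= min_age_seconds:
            stale.append((int(pid), int(etimes), command))
    return stale


def list_processes() -> str:
    """Capture pid, elapsed seconds and full command line for all processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etimes,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, `find_stale(list_processes(), min_age_seconds=8 * 3600)` would list candidates that have been around for more than eight hours. Anything it reports still needs to be checked by hand against who is actually connected.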