Job failing due to (code 139)

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

sharmabhavesh
Premium Member
Posts: 38
Joined: Tue Jun 19, 2012 11:03 pm
Location: India

Job failing due to (code 139)

Post by sharmabhavesh »

Hi,
One of my jobs is failing with the following error:
Parallel job reports failure (code 139)

Surprisingly, the same job ran fine in the dev environment but is failing in the QA environment. If I create a copy of the job and run the copy, it runs fine.

We faced the same issue earlier with another job and worked around it by creating a copy and running the copy instead. I think sometimes the job log gets locked and that prevents the job from running. Since the log is locked, we can't even rename or delete the job.

The problem is that we can't keep creating copies of jobs every time this error appears. Is anyone aware of a solution?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Did you do an exact search here for "Parallel job reports failure (code 139)"? There are 40 posts (not counting yours) that mention it; hopefully something in there is useful.
-craig

"You can never have too many knives" -- Logan Nine Fingers
sharmabhavesh
Premium Member
Posts: 38
Joined: Tue Jun 19, 2012 11:03 pm
Location: India

Post by sharmabhavesh »

Yes, Craig.
One of them matched my issue exactly; that person's workaround was also to create a copy of the job. But since I can't create a copy every time I encounter this error, I am looking for a proper solution if anyone can suggest one.
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

The project might be corrupt. You should probably run ASBNode/bin/SyncProject.sh (with the -Report option) to see whether it is. You can only run that tool when the project is not in use at all: no clients connected, no jobs running.

If it reports corruption, you then have the option to fix it (using the -Fix option). However, be warned: sometimes it 'fixes' things by removing them! So back up all the jobs before "fixing" them.
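For reference, a typical invocation looks something like this. The install path shown is just the default, and the -ISFile (credentials file) and -project flags are from memory, so verify everything against the usage output of your Information Server version before running it:

    cd /opt/IBM/InformationServer/ASBNode/bin

    # Report-only pass: lists inconsistencies without changing anything
    ./SyncProject.sh -ISFile islogin.txt -project YOURPROJECT -Report

    # Only after a full project backup: let the tool attempt repairs
    ./SyncProject.sh -ISFile islogin.txt -project YOURPROJECT -Fix

Always do the -Report pass first and review what it finds before you even think about -Fix.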

As for what might be causing the problem: you may have a lot of "zombies" in the background that are keeping jobs open and locked. When a developer loses their connection to DataStage (the computer goes into hibernation, or the wireless disconnects), background processes are left running that keep the job locked. Since they are alive but basically dead, I call them zombies! Setting a value for the inactivity timeout in the Administrator client makes those zombies go away once they have been inactive for that long, which releases their locks. Just don't set it too short, or it may kill running jobs that are waiting on large database selects, etc.!
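A quick way to check for zombies on the engine tier is a sketch like the following (dsapi_slave is the per-client engine process; the lock listing uses the engine's TCL shell, and piping the command into uvsh is from memory, so adjust for your install):

    # Orphaned client connections show up as dsapi_slave processes
    # whose client side is long gone
    ps -ef | grep dsapi_slave

    # List the engine's record locks to see which jobs are being held
    cd $DSHOME
    . ./dsenv
    echo "LIST.READU EVERY" | bin/uvsh

The inactivity timeout setting itself lives on the General tab of the Administrator client, if memory serves.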
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020