PX Job not terminating

Salegueule
Participant
Posts: 35
Joined: Fri May 21, 2004 4:22 pm

Post by Salegueule »

It looks like a job that started yesterday at 7:55 PM has not terminated. It has been running all night, for almost 11 hours now. I first tried to stop it from Director, but that did nothing; it is still running. Although we seem to have killed all the related processes in Unix last night, it is still running somewhere in the background.

I have run the following from the Administrator command line: CLEAR.FILE &PH& and it is still running.

Although it might sound a bit drastic, do you think that a DataStage restart or a server reboot could help at this point?


Thanks
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Hello Salegueule,

Doing a CLEAR.FILE &PH& will just clear the existing logfiles, it won't stop a job from continuing to run.
It might be that your job has actually finished running (with an abort) but the Director doesn't know that. If you do a "ps -ef | grep {your-user}", do you see any processes that use the ...orch... programs?
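
As a rough sketch of that check, the filter below keeps only the `ps -ef` lines owned by a given user whose command text contains "orch". The function name `find_orch_procs` is made up for illustration, and matching on the bare substring "orch" is an assumption based on the hint above, not an official process name.

```shell
#!/bin/sh
# Hypothetical helper: reads `ps -ef`-style lines on stdin and prints those
# owned by the given user whose text contains "orch" (the grep itself is skipped).
find_orch_procs() {
    user="$1"
    awk -v u="$user" '$1 == u && /orch/ && !/grep/ { print }'
}

# Usage against the live process table, for the current user:
# ps -ef | find_orch_procs "$(id -un)"
```

If this prints nothing, no matching processes are left for that user, which would fit the "finished, but the Director doesn't know" case.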

Short of stopping the machine, I think you could bring down DataStage (but don't try to restart it until all processes using DS have stopped).

One of the options in the Director is to "clear status file" which you might be able to do. This will make the job no longer show up as "running", but it is a mistake to set this if the actual processes are still going - it is easy to chew up a lot of CPU. BTW, if you check the CPU and IO usage on your machine do you see that the system is active or inactive?
rajpatel
Participant
Posts: 14
Joined: Fri Jul 01, 2005 8:31 am

Post by rajpatel »

ArndW's way to clear the status file is wonderful; the job will return to the compiled state once it is done.

We had the same situation as you and followed that route, which settled everything.

--raj
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Curiously, perhaps, a re-boot will not help in this circumstance.

A DataStage job records its status periodically in a table in the Repository. So do some of its active stages and "resources". This information is read by the Director client.

When people report that a job is not terminating, it usually means that the status table (RT_STATUSnn for job number nn) has not been updated with a status of "Finished", for example.

This may be for a number of reasons, the most common being that one of the processes was killed with a SIGKILL signal, or a child process failed to notify its parent successfully. In PX jobs the child processes are player processes and their parents are section leader processes, which in turn are child processes of the conductor process.
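
To illustrate why SIGKILL in particular leaves a stale status behind, here is a small stand-alone shell sketch. It is not DataStage code: the function `run_and_stop`, the temporary status file, and the "Running"/"Finished" strings are all invented for the demonstration. A worker that traps SIGTERM gets the chance to record a final status; a worker hit with SIGKILL cannot trap it, so the last recorded status stays as it was.

```shell
#!/bin/sh
# Hypothetical demo: start a worker, stop it with the given signal,
# then print whatever status the worker last recorded.
run_and_stop() {
    sig="$1"
    status_file=$(mktemp)
    (
        # SIGTERM can be trapped, so a final status can be written.
        # SIGKILL cannot be trapped, so this handler never runs for it.
        trap 'echo "Finished" > "$status_file"; exit 0' TERM
        echo "Running" > "$status_file"
        while :; do sleep 1; done
    ) &
    pid=$!
    # Wait until the worker has recorded that it is running.
    until grep -q Running "$status_file"; do sleep 1; done
    kill -s "$sig" "$pid"
    wait "$pid" 2>/dev/null || true
    cat "$status_file"
    rm -f "$status_file"
}

# Usage:
# run_and_stop TERM
# run_and_stop KILL
```

In the KILL case the status file still says "Running" although no process is left, which is the same shape of problem as a status table that never receives its final update.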
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.