How to kill a Grid job

bobyon
Premium Member
Posts: 200
Joined: Tue Mar 02, 2004 10:25 am
Location: Salisbury, NC

Post by bobyon »

Are there any special considerations when killing a job that is running on a grid?

What is the proper sequence of steps for killing a grid job that appears to be hung (i.e., it does not respond to a stop request from Director)?
Bob
lstsaur
Participant
Posts: 1139
Joined: Thu Oct 21, 2004 9:59 pm

Post by lstsaur »

In a grid environment, if the job is already in "HUNG" status, you have to kill it from the Resource Manager console or by issuing the "qdel" command from your Resource Manager's /bin directory. BTW, I am using PBS Pro as the Resource Manager.
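
As a rough illustration of the "qdel" route, here is a minimal Python sketch that builds the command and optionally runs it. It assumes PBS Pro's "qdel <job_id>" form. The job ID "1234.pbsserver" and the helper names are hypothetical, and the path to qdel depends on where your Resource Manager is installed.

```python
import shutil
import subprocess


def build_qdel_command(job_id, qdel_path="qdel"):
    """Return the argument list for deleting one job with qdel."""
    # Reject empty or whitespace-containing IDs so a bad value never reaches qdel.
    if not job_id or any(ch.isspace() for ch in job_id):
        raise ValueError("job_id must be a non-empty string without whitespace")
    return [qdel_path, job_id]


def delete_grid_job(job_id, qdel_path="qdel", dry_run=True):
    """Return (command, return_code); return_code is None on a dry run."""
    cmd = build_qdel_command(job_id, qdel_path)
    if dry_run:
        # Dry run: show what would be executed without touching the queue.
        return cmd, None
    if shutil.which(qdel_path) is None:
        raise FileNotFoundError(f"{qdel_path} not found; check the Resource Manager /bin directory")
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return cmd, completed.returncode


if __name__ == "__main__":
    # Hypothetical job ID, used only to show the call shape.
    print(delete_grid_job("1234.pbsserver"))
```

The dry-run default is deliberate: it lets you confirm the job ID against the Resource Manager console before anything is actually removed from the queue.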
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

You have to determine the nature of the hang. Is it waiting on a resource that is not available? Do you have an older version of the Grid Enablement Toolkit that didn't properly test the return code of your grid submit command?

I would suggest obtaining the latest Toolkit.

If the job is hung in the "Waiting to be released from Queue" state, then you have to kill the grid job. You could also dummy up an _end file in your GRID_JOB_DIR path to simulate a finished job, but that might cause the tool to see a valid termination rather than an abort.

I typically kill the DSD.RUN process, then ensure that the grid queue is clean of that entry.

My first attempt is always an abort via the Director tool.
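
For the DSD.RUN step, a small Python sketch like the one below could help locate the right process before killing it. It only parses "ps -ef" style text and does not send any signals. The sample output, the job name "LoadCustomers", and the function name are all made up for illustration, and the exact command line of the DSD.RUN process may look different on your system.

```python
def find_dsd_run_pids(ps_output, job_name):
    """Return PIDs of processes whose command line mentions DSD.RUN and job_name."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Expect "UID PID ..." columns; skip the header and malformed lines.
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        # Match on whole tokens so "OtherJob" does not match "OtherJob2".
        rest = fields[2:]
        if "DSD.RUN" in rest and job_name in rest:
            pids.append(int(fields[1]))
    return pids


# Hypothetical ps -ef output, for illustration only.
SAMPLE_PS = """\
UID        PID  PPID  C STIME TTY          TIME CMD
dsadm     4101  4090  0 10:25 ?        00:00:01 phantom DSD.RUN LoadCustomers 0/0/1/0/0
dsadm     4177  4090  0 10:31 ?        00:00:00 phantom DSD.RUN OtherJob 0/0/1/0/0
dsadm     4200  4101  0 10:25 ?        00:00:03 osh -f script
"""

if __name__ == "__main__":
    print(find_dsd_run_pids(SAMPLE_PS, "LoadCustomers"))
```

In practice you would feed it the real output of "ps -ef" on the conductor node, kill the returned PID by hand, and then check the grid queue for a leftover entry.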