How to stop the Job at once

Post questions here related to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.


jack_dcy
Participant
Posts: 18
Joined: Wed Jun 29, 2005 9:53 pm

How to stop the Job at once

Post by jack_dcy »

Hi,
we want to stop a running job at once. We tried to do this by killing the job's PID from the command line, but we can't find the PID of the job. What can we do?
By the way, we don't want to do this in Director.

Thanks!
benny.lbs
Participant
Posts: 125
Joined: Wed Feb 23, 2005 3:46 am

Re: How to stop the Job at once

Post by benny.lbs »

Try $DSHOME/bin/list_readu
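list_readu lists the engine's record locks, and its output includes the PID of each lock holder, which is why it helps here. A minimal invocation sketch, assuming a standard server install where /.dshome records the engine directory:

Code:

cd `cat /.dshome`       # change to the DataStage engine directory
. ./dsenv               # source the DataStage environment
bin/list_readu          # list active record locks with the owning PIDs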
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Jack_dcy,

doing a UNIX kill is not a good idea in DataStage, especially if you let slip a kill -9. Killing processes from the command line can leave locks hanging around that are not cleared (even with the uvdlockd running) and at worst might require a DataStage server restart to get going again.
jack_dcy
Participant
Posts: 18
Joined: Wed Jun 29, 2005 9:53 pm

Re: How to stop the Job at once

Post by jack_dcy »

benny.lbs wrote: Try $DSHOME/bin/list_readu
Thanks, but I don't know which PID belongs to the running job, because killing some of the PIDs only logs out a client. Is the running job's entry made up of the job name plus the suffix '.fifo'?

By the way, do you know what $DSHOME/bin/listpid.sh does?

Thanks.
jack_dcy
Participant
Posts: 18
Joined: Wed Jun 29, 2005 9:53 pm

Post by jack_dcy »

Hi ArndW,

thanks for your advice.
benny.lbs
Participant
Posts: 125
Joined: Wed Feb 23, 2005 3:46 am

Post by benny.lbs »

It seems DEADLOCK.MENU can clear those locks, right?
ArndW wrote: Killing processes from the command line can leave locks hanging around that are not cleared (even with the uvdlockd running) and at worst might require a DataStage server restart to get going again.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Benny,

the DEADLOCK.MENU item lets you do some things with the deadlock daemon, but you need to be an Admin to access it, and the daemon should already be started automatically at boot time. Usually the default 15-minute interval is sufficient.
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Code:

ps -ef | grep phantom | grep MyJobName
This should find the PID. You can then do a kill PID. DO NOT DO A kill -9 PID. This is not the recommended way to stop a job, but it should work. Killing PIDs attached to client processes, on the other hand, is not good: if you kill a DataStage Designer process you leave locks behind, and you need to clean those up. You do not need to mess with the deadlock daemon or DEADLOCK.MENU unless you are killing Designer processes or they die in some other way, like a firewall killing them.

DataStage runs smoothly if you do not kill these processes, and especially if you avoid kill -9 PID. It also runs smoothly if you do not run out of disk space. You should never have to REINDEX or use DS.TOOLS unless something abnormal happens, like running out of disk space or a hard shutdown when the electricity gets cut off. These should never be daily or weekly processes; something is not set up correctly if you do these things on a regular basis.
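A minimal sketch of the above, where MyJobName is a placeholder for the actual job name (and assuming exactly one matching phantom process):

Code:

# Find the job's phantom process and send it the default SIGTERM.
# Never use kill -9 here.
PID=`ps -ef | grep phantom | grep MyJobName | grep -v grep | awk '{print $2}'`
kill $PID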
Mamu Kim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

jack_dcy wrote: we want to stop a running job at once

No you don't, at least not by killing "the" process. The job consists of more than one process.
There's a conductor process on the node where the job was started.
There's a section leader process on each processing node.
There's a player process associated with each uncombined operator and a player process associated with each combined set of operators, on each processing node.

If you insist on kill, you need to kill the players first (and don't use kill -9) then the section leaders and finally the conductor process.

What's wrong with using a Stop request from Director or dsjob, which manages things gracefully?
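For reference, the graceful route can be scripted as well; a minimal sketch, assuming a standard install where /.dshome records the engine directory, with MyProject and MyJobName as placeholders:

Code:

# Issue a graceful stop request, equivalent to Stop in Director.
cd `cat /.dshome` && . ./dsenv        # locate and source the DataStage environment
bin/dsjob -stop MyProject MyJobName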
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
rsrikant
Participant
Posts: 58
Joined: Sat Feb 28, 2004 12:35 am
Location: Silver Spring, MD

Post by rsrikant »

Ray,

In my job I have 3 parallel OCI stages extracting data and loading it into sequential files.

OCI1 ------> SEQ1

OCI2 -------> SEQ2

OCI3 -------> SEQ3

I tried stopping my job from Director. It shows the job as stopped.

But two of the OCI stages are still extracting data and loading into the SEQ files.

The job's status shows Stopped, it lets me compile the job, and everything looks fine even in the Monitor. But the OCI stages are still extracting data. I know this for sure because the job log keeps collecting warnings: my query is fetching a date wrongly, so every record gets a warning message in two of the OCI stages. That is the reason I stopped the job in the first place, yet it is still running in the background and has been filling the job log for the last 2 hours.

What can I do in this situation? I am afraid it might eat my entire disk space, as I have 18 million records and every record gets a warning message.

How do I stop the background processes?

Thanks,
Srikanth
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Have your DBA kill the Oracle processes.
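For instance, the DBA could locate and kill the sessions from SQL*Plus; a hedged sketch, where ETL_USER and the '123,4567' sid/serial# pair are placeholders to be filled in from v$session:

Code:

# Run on the database server as a privileged user.
sqlplus -s / as sysdba <<'EOF'
-- Identify the sessions opened by the job (ETL_USER is a placeholder).
SELECT sid, serial#, username, program FROM v$session WHERE username = 'ETL_USER';
-- Kill each offending session; substitute the real sid,serial# pair.
ALTER SYSTEM KILL SESSION '123,4567';
EOF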
-craig

"You can never have too many knives" -- Logan Nine Fingers
rsrikant
Participant
Posts: 58
Joined: Sat Feb 28, 2004 12:35 am
Location: Silver Spring, MD

Post by rsrikant »

Thanks Craig!

I got the problem solved.
Just killed the PIDs using the command given by Kim:

ps -ef | grep phantom | grep MyJobName

Thanks, Kim, for that. :D

But i didn't understand why stopping the job from designed didn't give me the same results and why the status is misleading me saying the job got stopped.

Any information on this will be very good to know.

Thanks,
Srikanth
rsrikant
Participant
Posts: 58
Joined: Sat Feb 28, 2004 12:35 am
Location: Silver Spring, MD

Post by rsrikant »

Oops! A correction to my last post.

rsrikant wrote: But i didn't understand why stopping the job from designed didn't give me the same results and why the status is misleading me saying the job got stopped.

I meant that stopping the job from Director didn't give the same results as killing the PIDs.

Thanks,
Srikanth
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

What you do from Director is an asynchronous stop request, in much the same way that a run request from Director is asynchronous; the server will get around to it. It's akin to kill -15: there is a "grace time" in which the process in question can close files, drop database connections, release locks and so on before actually shutting down. Killing the PIDs leaves all these things potentially open, which is why you then have to kill Oracle processes, clean up locks held by now-defunct processes, and the like.
Further, since there is a hierarchy of DataStage processes, you need to be careful with kill because, if you kill the parent, a child process can turn into a zombie - and they're really hard to kill!
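As a general UNIX pattern (not DataStage-specific), a polite kill with a grace period looks like this, with PID as a placeholder:

Code:

kill -15 $PID               # polite stop request: lets the process clean up
sleep 30                    # grace period to close files, connections and locks
kill -0 $PID 2>/dev/null && echo "Still running; investigate before trying anything harsher"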
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.