
How to stop the Job at once

Posted: Wed Jul 20, 2005 4:08 am
by jack_dcy
Hi,
We want to stop a running job immediately. We tried to do this by killing the job's PID from the command-line interface, but we can't find the PID of the job. What can we do?
By the way, we don't want to do this in Director.

Thanks!

Re: How to stop the Job at once

Posted: Wed Jul 20, 2005 4:14 am
by benny.lbs
Try $DSHOME/bin/list_readu
jack_dcy wrote:Hi,
We want to stop a running job immediately. We tried to do this by killing the job's PID from the command-line interface, but we can't find the PID of the job. What can we do?
By the way, we don't want to do this in Director.

Thanks!

Posted: Wed Jul 20, 2005 4:31 am
by ArndW
Jack_dcy,

doing a UNIX kill is not a good idea in DataStage, especially if you let slip a kill -9. Killing processes from the command line can leave locks hanging around that are not cleared (even with the uvdlockd running) and at worst might require a DataStage server restart to get going again.

Re: How to stop the Job at once

Posted: Wed Jul 20, 2005 4:43 am
by jack_dcy
benny.lbs wrote:Try $DSHOME/bin/list_readu
Thanks, but I don't know which PID belongs to the running job; killing some of the PIDs only logs out a client. Is the running job's entry made up of the job name plus the suffix '.fifo'?

By the way, do you know what $DSHOME/bin/listpid.sh is for?

Thanks.

Posted: Wed Jul 20, 2005 4:48 am
by jack_dcy
Hi ArndW,

Thanks for your advice.

Posted: Wed Jul 20, 2005 8:07 am
by benny.lbs
It seems DEADLOCK.MENU can clear these locks, right?
ArndW wrote:Jack_dcy,

doing a UNIX kill is not a good idea in DataStage, especially if you let slip a kill -9. Killing processes from the command line can leave locks hanging around that are not cleared (even with the uvdlockd running) and at worst might require a DataStage server restart to get going again.

Posted: Wed Jul 20, 2005 8:44 am
by ArndW
Benny,

The DEADLOCK.MENU item lets you work with the deadlock daemon, but you need to be an Administrator to access it, and the daemon should be started automatically at boot time. Usually the default 15-minute interval is sufficient.

Posted: Wed Jul 20, 2005 9:17 am
by kduke

    ps -ef | grep phantom | grep MyJobName
This should find the PID. You can then do a kill PID. DO NOT DO A kill -9 PID. This is not the recommended way to stop a job, but it should work. Killing PIDs attached to client processes is a different matter and is not good: if you kill a DataStage Designer process, you leave locks behind and need to clean them up. You do not need to mess with the deadlock daemon or DEADLOCK.MENU unless you are killing Designer processes or they die in some other way, such as a firewall dropping them.

DataStage runs smoothly if you do not kill these processes, and especially if you avoid kill -9 PID. It also runs smoothly if you do not run out of disk space. You should never have to REINDEX or use DS.TOOLS unless something abnormal happens, such as running out of disk space or a hard shutdown like the electricity being cut off. These should never be daily or weekly processes; something is not set up correctly if you do them on a regular basis.
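Kim's recipe can be wrapped in a small script. This is only a sketch, assuming a Unix DataStage server; MyJobName is a placeholder for the real job name:

```shell
#!/bin/sh
# Find the phantom process for a given job and send it a plain SIGTERM.
# Never use kill -9 here: the process gets no chance to release locks.
find_job_pid() {
    # Phantom processes carry the job name on their command line;
    # grep -v grep drops our own pipeline from the ps output.
    ps -ef | grep phantom | grep "$1" | grep -v grep | awk '{print $2}'
}

pid=$(find_job_pid "MyJobName")
if [ -n "$pid" ]; then
    kill "$pid"    # default signal is SIGTERM, not SIGKILL
else
    echo "no phantom process found for MyJobName"
fi
```

On a box with no such job running, it simply reports that nothing was found.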

Posted: Wed Jul 20, 2005 3:11 pm
by ray.wurlod
we want to stop a running job at once

No you don't, at least not by killing "the" process. The job consists of more than one process.
There's a conductor process on the node where the job was started.
There's a section leader process on each processing node.
There's a player process associated with each uncombined operator and a player process associated with each combined set of operators, on each processing node.

If you insist on kill, you need to kill the players first (and don't use kill -9) then the section leaders and finally the conductor process.

What's wrong with using Stop request from Director or dsjob, which manages things gracefully?
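Ray's graceful alternative can also be scripted, which avoids Director entirely while still going through the engine. A minimal sketch, assuming $DSHOME points at the DataStage engine directory; MyProject and MyJobName are placeholders:

```shell
#!/bin/sh
# Ask the engine to stop a job gracefully via dsjob, the command-line
# counterpart of Director's Stop request.
stop_job() {
    project="$1"
    job="$2"
    # -stop sends an asynchronous stop request; the job gets a chance
    # to close files, drop connections and release locks.
    "$DSHOME/bin/dsjob" -stop "$project" "$job"
}
```

Usage: `stop_job MyProject MyJobName` (after sourcing `$DSHOME/dsenv`).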

Posted: Mon Aug 01, 2005 4:13 pm
by rsrikant
Ray,

In my job I have 3 OCI stages extracting data in parallel and loading it into sequential files.

OCI1 ------> SEQ1

OCI2 -------> SEQ2

OCI3 -------> SEQ3

I tried stopping my job from Director. It shows the job as stopped.

But two of the OCI stages are still extracting data and loading it into the SEQ files.

The job status shows as Stopped. It lets me compile the job. Everything looks fine even in the Monitor. But the OCI stages are still extracting data. I know this for sure because the job log keeps filling with warnings: my query fetches a date incorrectly, so every record raises a warning in two of the OCI stages. That is why I stopped the job. But it is still running in the background, and the job log has been filling up for the last 2 hours.

What can I do in this situation? I am afraid it might take up my entire disk space, as I have 18 million records and every record gets a warning message.

How do I stop the background processes?

Thanks,
Srikanth

Posted: Mon Aug 01, 2005 5:13 pm
by chulett
Have your DBA kill the Oracle processes.

Posted: Mon Aug 01, 2005 7:47 pm
by rsrikant
Thanks Craig!

I got the problem solved: I just killed the PIDs using the command Kim gave.

ps -ef | grep phantom | grep MyJobName


Thanks, Kim, for that. :D

But i didn't understand why stopping the job from designed didn't give me the same results and why the status is misleading me saying the job got stopped.

Any information on this will be very good to know.

Thanks,
Srikanth

Posted: Mon Aug 01, 2005 7:49 pm
by rsrikant
Oops! A correction to my last post. I wrote:

But i didn't understand why stopping the job from designed didn't give me the same results and why the status is misleading me saying the job got stopped.

I meant that stopping the job from Director didn't give the same results as killing the PIDs.

Thanks,
Srikanth

Posted: Mon Aug 01, 2005 8:53 pm
by ray.wurlod
What you do from Director is an asynchronous stop request, in much the same way that a run request from Director is asynchronous; the server will get around to it. It's akin to kill -15: there is a "grace time" in which the process in question can close files, drop database connections, release locks and so on before actually shutting down. Killing the PIDs leaves all these things potentially open, which is why you then have to kill Oracle processes, clean up locks held by now-defunct processes, and the like.
Further, since there is a hierarchy of DataStage processes, you need to be careful with kill: if you kill the parent, a child process can turn into a zombie - and those are really hard to kill!
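The kill -15 grace time described above can be seen with any Unix process, not just DataStage. A minimal sketch: a background worker traps SIGTERM and cleans up before exiting, which kill -9 would never allow.

```shell
#!/bin/sh
# Demonstrate the SIGTERM "grace time": a process that traps SIGTERM
# can clean up before exiting; kill -9 would give it no chance at all.
log=$(mktemp)

(
    trap 'echo "worker: releasing locks, closing files" >> "$log"; exit 0' TERM
    while :; do sleep 1; done
) &
worker=$!

sleep 1                 # give the worker time to install its trap
kill -TERM "$worker"    # a polite stop: the trap handler runs first
wait "$worker"
cat "$log"              # shows the cleanup message
```

Had we used kill -9 on the worker, the trap would never fire and the log would stay empty, which is exactly the "locks left behind" situation described earlier in the thread.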