The Job is running and I can't stop it

alraaayeq
Participant
Posts: 35
Joined: Sun Apr 04, 2004 5:57 am
Location: Riyadh,Saudi Arabia

Post by alraaayeq »

Hi,


How can I "force kill" a running job?

I ran a job that loops until it finds a kill tag file (using the WaitForFile function). The problem is that on each pass of the loop it waits for X seconds, and by mistake the code is configured to run for 60 hours!

I can't wait that long. I looked for the process with 'ps -ef' so that I could use "kill -9", but I could not find it in the list. Clicking the Stop button in DataStage Director does nothing, and releasing locks through DS.TOOLS in DataStage Administrator did not do anything either (unless I did it wrongly).

What can I do? Did I miss anything (apart from setting it to run for 60 hours ;) )?
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Try to avoid kill -9 whenever possible so that you don't leave locks lying around. You should be able to find your process in ps -ef; if not, then it might really be gone, and all you have to do is reset the process flags in the Administrator.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

Set the next job in the sequence to 'Not Compiled' status and create the sequential file, so that the wait-for-file succeeds and the next activity fails, causing the sequence to abort by itself.

Post by alraaayeq »

ArndW wrote:Try to avoid the kill -9 whenever possible so that you don't have locks lying around. You should be able to find your process in ps -ef; if not then it might be really gone and all you have to do is reset the process flags in the Administrator.
Resetting the process flags is new to me, and I could not find any hints about it. Can you please give me more details?

Post by alraaayeq »

Sainath.Srinivasan wrote:Set the next job in the sequence to 'Not Compiled' status and create the seq file so the 'wait-for-file' becomes successful and fail in the next event causing the seq to abort by itself.
Yes, what you said is true, but since the job is SLEEPING (using the sleep command), the 60 hours will run to completion even if I create the kill tag file or put the child job in a non-runnable state.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You've just discovered why it's not best practice to use long periods of SLEEP.

Re-code the job so that it wakes up occasionally to determine whether any signals or other notifications have been received.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by alraaayeq »

ray.wurlod wrote:You've just discovered why it's not best practice to use long periods of SLEEP.

Re-code the job so that it wakes up occasionally to determine whether any signals or other notifications have been received.
Yes, I will be in trouble if I use long periods of sleep. But this is the only way I can schedule the job to run every 2 hours (without using a third-party tool)!

So I conclude that "force killing" is not possible for running jobs in Ascential DS!

Post by Sainath.Srinivasan »

You can do the following:
Wait-for-file for 60 mins -> if found then exit else continue
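
The same idea (wait a bounded time for a file, checking at short intervals so the wait can be cut short) can be sketched outside DataStage as a small shell function. This is only an illustration, not DataStage's own wait-for-file activity; the function name, the argument order, and the default 5-second interval are all assumptions made for the sketch.

```shell
# wait_for_stop_file FILE TIMEOUT [INTERVAL]
# Hypothetical helper: returns 0 as soon as FILE exists,
# 1 if TIMEOUT seconds pass without it appearing.
# INTERVAL (default 5) is how long each short sleep lasts.
wait_for_stop_file() {
    file=$1
    timeout=$2
    interval=${3:-5}
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if [ -e "$file" ]; then
            return 0            # stop file found: exit early
        fi
        sleep "$interval"       # short sleep, never the whole timeout
        waited=$((waited + interval))
    done
    # One last check after the final sleep.
    [ -e "$file" ]
}

# Example call (path is illustrative): wait up to 60 minutes.
#   if wait_for_stop_file /tmp/stop.tag 3600; then echo "stop requested"; fi
```

Because each individual sleep is short, creating the file takes effect within one interval instead of after the full wait.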

Post by alraaayeq »

Sainath.Srinivasan wrote:You can do the following:
Wait-for-file for 60 mins -> if found then exit else continue

That is a better idea. Using SLEEP is bugging me: would you believe that sleeping for 3600 seconds sometimes takes less than 20 minutes, and sometimes takes an unpredictable amount of time? It is really weird behaviour.

BTW, the main thing I am trying to figure out is how to kill a task, no matter what it is doing, without having to restart the server.

Post by Sainath.Srinivasan »

Sleep and Nap use computer cycles to calculate the time. You can trace the UNIX sleep.

If you intend to kill or release processes, you will need access to something like DS.TOOLS or dssh, access to the dsadm user, and a good knowledge of who is doing what on the system. It is not recommended if you cannot trace your process in full.

Post by ray.wurlod »

There's a second form of the SLEEP statement, that sleeps until a particular time. Note the following code fragment.

Code:

* Wake every five minutes to check for notifications.
* Elapsed seconds are counted, rather than comparing clock times,
* so that the loop still ends correctly when it spans midnight.
Elapsed = 0

Loop
   GoSub CheckNotifications      ; * expected to set Notified
   If Notified Then Exit
   NextTime = Time() + 300       ; * re-read the clock on every pass
   If NextTime >= 86400 Then NextTime -= 86400
   Sleep Oconv(NextTime, "MT:")  ; * SLEEP hh:mm
   Elapsed += 300
* Exit loop when two hours have elapsed.
While Elapsed < 7200
Repeat

Post by alraaayeq »

Sainath.Srinivasan wrote:Sleep and Nap uses the computer cycles to calculate the time. You can trace unix sleep.

If you intend to kill or release process, you will need to have access to something like DS.TOOLS, dssh etc, access to dsadm user and also good knowledge of who is doing what in the system. It is not recommended if you cannot trace your process in full.
I don't know why DS.TOOLS did not help me; I wonder if I used it wrongly.
Can you explain more about which lock-releasing and job-killing features in DS.TOOLS are useful for force-killing a running job (ones I may not be aware of)?
newtier
Premium Member
Posts: 27
Joined: Mon Dec 13, 2004 5:50 pm
Location: St. Louis, MO

Post by newtier »

You should be able to kill the specific process. Suppose your job name is: mySleepJob

Issue the command: ps -ef | grep mySleepJob (or any portion of the string)

If the job is actually running, you should see the process ID (the second column of ps -ef output) and, in the column to its right, the parent's process ID (probably the DataStage engine).

Suppose the process id is: 2329392

Issue the command: kill 2329392

Likewise, you can go into the DataStage "universe" (and let's not start the semantic argument over again that it's not really universe) and find the processes for the "user". Then you can use Universe commands to end the processes for that user.
Rick H
Senior Consultant
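
One practical wrinkle with `ps -ef | grep mySleepJob` is that the grep process itself usually shows up in the list, because its own command line contains the job name. A minimal sketch of a filter that avoids this is shown below. It assumes the usual `ps -ef` layout (PID in column 2, PPID in column 3, one header line); the function name `list_job_pids` is made up for the example.

```shell
# list_job_pids NAME
# Hypothetical helper: reads `ps -ef` output on stdin and prints
# "PID PPID" for every process line that mentions NAME.
# NAME is passed to awk through the environment, not on awk's command
# line, so the filter does not match its own entry in the listing.
list_job_pids() {
    JOB_NAME=$1 awk '
        NR > 1 && index($0, ENVIRON["JOB_NAME"]) { print $2, $3 }
    '
}

# Example call:
#   ps -ef | list_job_pids mySleepJob
```

Any PID printed this way should still be checked by eye before it is passed to kill.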

Post by ray.wurlod »

The "end process" from Director and from DS.TOOLS, and the "UniVerse" command (MASTER OFF) all send a signal (kill -15).

If the process is not using any CPU cycles (for example if it is sleeping) it will not be able to service this signal. This is why "log out process" sometimes appears not to work.

You can wait for a while (say five minutes) to see whether the process wakes and is able to process the signal. Most folks, however, demand instant gratification, and resort to an immediate assumption that it's not going to work, then reach immediately for a larger hammer (kill -9), then wonder why they have to clean up locks, open files and other resources that the non-ignorable signal did not give the process opportunity to clean up gracefully.
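
The "send the polite signal, then wait" approach described above can be sketched as a shell function. It sends SIGTERM (kill -15), polls for up to a grace period (300 seconds by default, matching the "say five minutes" suggestion), and reports whether the process went away. It deliberately never escalates to kill -9. The function name and return codes are assumptions for this sketch, and it presumes the PID has already been identified with certainty.

```shell
# stop_gracefully PID [GRACE_SECONDS]
# Hypothetical helper. Return codes:
#   0  the process exited within the grace period
#   1  the process is still running after the grace period
#   2  the signal could not be sent (no such process, or no permission)
stop_gracefully() {
    pid=$1
    grace=${2:-300}
    if ! kill -TERM "$pid" 2>/dev/null; then
        return 2
    fi
    waited=0
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example call (PID taken from the earlier post):
#   stop_gracefully 2329392 300 || echo "still running or not signalled"
```

A return code of 1 is the point at which to decide, knowingly, whether cleaning up after kill -9 is worth it.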

Post by alraaayeq »

ray.wurlod wrote:....... Most folks, however, demand instant gratification, and resort to an immediate assumption that it's not going to work, then reach immediately for a larger hammer (kill -9), then wonder why they have to clean up locks,......
Yeah, the larger hammer.

I did it with the help of the biggest hammer of all: restarting the daemon.

I talked to an expert with 4+ years of experience, and he was surprised that the process was still writing log entries (in Director) even though the job was using no resources. Also, you can reset the job and run it again, but if you try to recompile it, a message pops up saying "this job might be monitored" or something like that.

That's what I call "frustration".