I ran a job that loops until it finds the kill tag (using the waitforfile function). The problem is that each pass of the loop waits for X seconds, and by mistake the code is configured to run for 60 hours!
I can't wait that long. I tried 'ps -ef' so that I could use "kill -9", but I could not find the job in the list. Clicking the stop button in DataStage Director does nothing, and releasing locks through DS.TOOLS in DataStage Administrator did not do anything either (unless I did it wrongly).
What can I do? Did I miss anything (apart from running it for 60 hours)?
Try to avoid kill -9 whenever possible so that you don't have locks lying around. You should be able to find your process in ps -ef; if not, it might really be gone, and all you have to do is reset the process flags in the Administrator.
Set the next job in the sequence to 'Not Compiled' status and create the seq file, so that the 'wait-for-file' succeeds and the next event fails, causing the sequence to abort by itself.
ArndW wrote:Try to avoid kill -9 whenever possible so that you don't have locks lying around. You should be able to find your process in ps -ef; if not, it might really be gone, and all you have to do is reset the process flags in the Administrator.
Resetting the process flags is new to me and I could not find any hints about it. Can you please give me more details?
Sainath.Srinivasan wrote:Set the next job in the sequence to 'Not Compiled' status and create the seq file so the 'wait-for-file' becomes successful and fail in the next event causing the seq to abort by itself.
Yes, what you said is true, but since the job is SLEEPING (using the sleep command), the 60 hours will run to completion even if I create the kill tag or put the child job in a non-runnable state.
ray.wurlod wrote:You've just discovered why it's not best practice to use long periods of SLEEP.
Re-code the job so that it wakes up occasionally to determine whether any signals or other notifications have been received.
Yes, I will be in trouble if I use long periods of sleep. But this is the only way I can schedule the job to run every 2 hours (without a third-party tool)!
So I can conclude that "force killing" is not possible with running Ascential DS jobs!
Sainath.Srinivasan wrote:You can do the following:
Wait-for-file for 60 mins -> if found then exit else continue
That is a better idea. Using SLEEP is bugging me: would you believe that sleeping 3600 seconds sometimes takes less than 20 minutes, and sometimes takes an unpredictable amount of time? It is really weird behaviour.
BTW, the main thing I am trying to figure out is how to kill a task, no matter what it is doing, without having to restart the server.
Sleep and Nap use the computer cycles to calculate the time. You can trace unix sleep.
If you intend to kill or release a process, you will need access to something like DS.TOOLS or dssh, access to the dsadm user, and good knowledge of who is doing what in the system. It is not recommended if you cannot trace your process in full.
* Wake every five minutes to check whether any notifications have arrived.
* Exit the loop when two hours have elapsed.
Elapsed = 0
Notified = @FALSE
Loop
   GoSub CheckNotifications ; * expected to set Notified
   If Notified Then Exit
   Sleep 300 ; * five minutes, in seconds
   Elapsed += 300
While Elapsed < 7200
Repeat
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
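For anyone who wants to try the wake-and-check idea outside DataStage, here is a minimal Python sketch of the same pattern: many short sleeps with a check for a stop file between them, instead of one long sleep. The function name `wait_with_checks`, the `stop_file` parameter, and the injectable `sleep` argument are made up for illustration; this is not DataStage code and not anything posted by the people in this thread.

```python
import os
import time


def wait_with_checks(stop_file, total_seconds, check_every, sleep=time.sleep):
    """Wait up to total_seconds, waking every check_every seconds.

    Returns True as soon as stop_file exists, False if the time runs out.
    The sleep argument is only there so the loop can be tried without waiting.
    """
    elapsed = 0
    while elapsed < total_seconds:
        if os.path.exists(stop_file):
            return True                      # notification found, stop early
        step = min(check_every, total_seconds - elapsed)
        sleep(step)                          # short nap instead of one long sleep
        elapsed += step
    return os.path.exists(stop_file)         # one last look when time is up
```

A call such as `wait_with_checks("/tmp/kill.tag", 7200, 300)` (the path is hypothetical) would wait at most two hours and notice the tag within about five minutes of it being created.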
Sainath.Srinivasan wrote:Sleep and Nap use the computer cycles to calculate the time. You can trace unix sleep.
If you intend to kill or release a process, you will need access to something like DS.TOOLS or dssh, access to the dsadm user, and good knowledge of who is doing what in the system. It is not recommended if you cannot trace your process in full.
I don't know why DS.TOOLS did not help me; I wonder if I used it wrongly.
Can you explain more? Which of the lock-releasing and job-killing features in DS.TOOLS are useful for force-killing a running job (that I am not aware of)?
You should be able to kill the specific process. Suppose your job name is: mySleepJob
Issue the command: ps -ef | grep mySleepJob (or any portion of the string)
If the job is actually running, you should see the process id (on the left) and the parent's process id (DataStage engine probably) on the right.
Suppose the process id is: 2329392
Issue the command: kill 2329392
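The steps above can be sketched in Python, under a few assumptions: the PID is taken to be the second column of `ps -ef` output, the match is a plain substring test like `grep`, and the 300-second grace period is only a default picked for illustration. The helper names `find_pids`, `running_pids` and `stop_politely` are hypothetical. Only a plain `kill` (SIGTERM) is sent; nothing here escalates to kill -9.

```python
import os
import signal
import subprocess
import time


def find_pids(ps_output, needle):
    """Return the PIDs of `ps -ef` lines that contain needle (like grep)."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # The header row is skipped because its second field ("PID") is not a number.
        if needle in line and len(fields) > 1 and fields[1].isdigit():
            pids.append(int(fields[1]))
    return pids


def running_pids(needle):
    """Run `ps -ef` and return the PIDs whose line mentions needle."""
    result = subprocess.run(["ps", "-ef"], capture_output=True, text=True, check=True)
    return find_pids(result.stdout, needle)


def stop_politely(pid, grace_seconds=300, poll_seconds=1.0):
    """Send SIGTERM (a plain `kill`) and wait; True if the process went away."""
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return True                          # already gone
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)                  # signal 0 only checks that the PID exists
        except ProcessLookupError:
            return True
        time.sleep(poll_seconds)
    return False                             # still there after the grace period
```

Something like `for pid in running_pids("mySleepJob"): stop_politely(pid)` would then mirror the `ps -ef | grep` and `kill` commands. It needs to run as a user allowed to signal the process, otherwise `os.kill` raises `PermissionError`.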
Likewise you can go into the DataStage "universe" (and let's not start the semantic argument over again that it's not really universe) and find the processes for the "user". Then you can use Universe commands to end the processes for that user.
The "end process" from Director and from DS.TOOLS, and the "UniVerse" command (MASTER OFF) all send a signal (kill -15).
If the process is not using any CPU cycles (for example if it is sleeping) it will not be able to service this signal. This is why "log out process" sometimes appears not to work.
You can wait for a while (say five minutes) to see whether the process wakes and is able to process the signal. Most folks, however, demand instant gratification, and resort to an immediate assumption that it's not going to work, then reach immediately for a larger hammer (kill -9), then wonder why they have to clean up locks, open files and other resources that the non-ignorable signal did not give the process opportunity to clean up gracefully.
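The cleanup point can be tried on any POSIX box with a small Python sketch. This is a toy stand-in, not DataStage: the "job" is a child script that takes a made-up lock file and removes it in its SIGTERM handler, and `lock_left_behind` is a hypothetical helper that sends a signal and checks whether the lock file survived. The child naps in short steps so that it notices the signal promptly.

```python
import os
import signal
import subprocess
import sys
import tempfile
import time

# Toy "job": takes a lock file and releases it when asked to terminate.
CHILD = r"""
import os, signal, sys, time
lock = sys.argv[1]
def cleanup(signum, frame):
    os.remove(lock)          # release the "lock" before exiting
    sys.exit(0)
signal.signal(signal.SIGTERM, cleanup)
open(lock, "w").close()      # take the "lock"
while True:
    time.sleep(0.1)
"""


def lock_left_behind(sig):
    """Start the toy job, send it sig, and report whether its lock file survives."""
    lock = os.path.join(tempfile.mkdtemp(), "job.lock")
    child = subprocess.Popen([sys.executable, "-c", CHILD, lock])
    deadline = time.monotonic() + 10
    while not os.path.exists(lock):          # wait until the child holds the lock
        if time.monotonic() > deadline:
            child.kill()
            raise RuntimeError("toy job did not start")
        time.sleep(0.05)
    child.send_signal(sig)
    child.wait()
    return os.path.exists(lock)


if __name__ == "__main__":
    print("lock left after SIGTERM:", lock_left_behind(signal.SIGTERM))
    print("lock left after SIGKILL:", lock_left_behind(signal.SIGKILL))
```

SIGTERM reaches the handler, so the lock file is removed before the process exits. SIGKILL cannot be caught, so the handler never runs and the lock file stays behind for someone to clean up by hand.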
ray.wurlod wrote:....... Most folks, however, demand instant gratification, and resort to an immediate assumption that it's not going to work, then reach immediately for a larger hammer (kill -9), then wonder why they have to clean up locks,......
Yeah, a larger hammer.
I did it with the help of the biggest hammer, which is restarting the daemon.
I met an expert with 4+ years of experience; he was surprised that the process was still writing log entries (in Director) even though the job was using no resources. Also, you can reset the job and run it again, but if you try to recompile it, a message pops up saying "this job might be monitored" or something like that.