Why can't a running job be stopped?

Posted: Sat Feb 19, 2005 8:53 pm
by olgc
Hi there,

When a DataStage job calls a Unix script to do a DB backup, the backup process hangs. After the backup process is killed, the job is still running. We try to stop it, but it keeps running. The stop request keeps trying, but it just cannot stop the job. Even when we use Clear Resource in Director to log out its resources, the job is still running, and so is its controller job. In this situation, how can we stop the DataStage job? Recycle the server? That's certainly a bad choice. The OS is AIX 5.1.

Thanks,

Posted: Sat Feb 19, 2005 8:56 pm
by olgc
Even after we recycle the DataStage server, the job is still running!

Posted: Sat Feb 19, 2005 10:13 pm
by ray.wurlod
The job probably isn't really running.

What the Director status view shows you is the most recently recorded status of the job, which is read from the job's RT_STATUSnnn table.

If the job was killed without the opportunity to update RT_STATUSnnn you will still be reading a status of "running". That's why, among other reasons, you should not kill DataStage processes on the server with kill -9 (which gives no opportunity to clean up).
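To illustrate the point about kill -9, here is a minimal Python sketch of the "ask politely first" pattern: send SIGTERM, wait a while, and fall back to SIGKILL only if the process ignores it. The function name stop_gracefully and the 10-second grace period are my own choices for illustration, and it works on a child process started from Python rather than on a real DataStage process.

```python
import subprocess


def stop_gracefully(proc, grace_seconds=10.0):
    """Stop a child process, giving it a chance to clean up first.

    proc is a subprocess.Popen object. Returns "terminated" if the process
    exited after SIGTERM, or "killed" if SIGKILL was needed.
    """
    # SIGTERM on Unix: the process can catch it and clean up before exiting.
    proc.terminate()
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        # SIGKILL (the equivalent of kill -9): cannot be caught, so the
        # process gets no opportunity to record its final status.
        proc.kill()
        proc.wait()
        return "killed"
```

For a process you did not start yourself, the same idea applies with os.kill(pid, signal.SIGTERM) followed by a wait before escalating.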

Clearing the status file (option in Director, Job menu) will remedy the situation for you. Really, though, you should first check the processes on the server to make sure that the job really isn't running. Given that you've cycled DataStage, this is a certainty in this case!
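As a rough sketch of the "check the processes on the server" step, the Python below filters ps -ef output for lines that mention a given job. The marker strings DSD.RUN and DSD.StageRun are an assumption about how server job processes appear in the process list on your system, and NightlyBackup is a made-up job name; adjust both to what you actually see.

```python
def find_job_processes(ps_output, job_name, markers=("DSD.RUN", "DSD.StageRun")):
    """Return the ps lines that look like DataStage processes for job_name.

    ps_output is the text printed by a command such as "ps -ef".
    markers are assumed process-name fragments, not a guaranteed list.
    """
    hits = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Match whole fields so "NightlyBackup" does not match "NightlyBackup2".
        if job_name in fields and any(marker in fields for marker in markers):
            hits.append(line)
    return hits


# Example with captured text; on a live server the text could come from
# subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout
sample = (
    "dsadm 101 1 0 20:41 - 0:00 phantom DSD.RUN NightlyBackup 0/0/1/0/0\n"
    "root 202 1 0 20:40 - 0:00 /usr/sbin/cron\n"
)
for line in find_job_processes(sample, "NightlyBackup"):
    print(line)
```

If the function returns an empty list, nothing matching the job is alive and a leftover "running" status is only a stale record.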

Posted: Sun Feb 20, 2005 4:22 am
by roy
Hi,
In some extreme cases your job is waiting for something.
In that case it might not be aware of your stop attempt until it gets a reply from the resource it is waiting on (e.g. a sequence job waiting for one or more jobs to finish will only notice it was stopped after one of the jobs it runs signals a change such as stopped or finished).
But as Ray said, if you stopped and started the DS server/service, it is more likely in your case that only RT_STATUSnnn was not cleared, since no process should still exist.
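A toy Python model of the waiting case, purely as an analogy and not how DataStage is implemented: the controller only looks at the stop flag when its child reports a change, so a stop request that arrives while it is blocked sits unnoticed until then. The event names and the function controller_timeline are invented for this sketch.

```python
def controller_timeline(events):
    """Replay events in arrival order and log what a blocked controller does."""
    stop_requested = False
    log = []
    for event in events:
        if event == "stop_request":
            # The request is recorded, but the controller is blocked waiting
            # on its child, so nothing acts on it yet.
            stop_requested = True
            log.append("stop requested, controller still waiting")
        elif event == "child_changed":
            log.append("child signalled a change, controller wakes up")
            if stop_requested:
                log.append("controller sees the stop request and exits")
                break
    return log


for entry in controller_timeline(["stop_request", "child_changed"]):
    print(entry)
```

If the child never signals anything (for example because it hangs), the list of events never contains child_changed and the controller never gets to act on the stop request.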

IHTH,

Posted: Sun Feb 20, 2005 9:48 am
by olgc
You guys are right, thanks.

I recompiled the job and it now shows the right status.