DataStage job log and control job inconsistencies

Posted: Tue May 07, 2002 2:53 am
by nigel
Hi All

I'm using DS 4.0xx. I am seeing errors where a control/batch job "hangs" because it has not received a completion response from a child/controlled job.
The child job completes normally, but the control job still hangs. Sometimes the child job itself hangs, even though the log entries indicate that it has completed. This error is not specific to any one job; it occurs with other jobs and control jobs as well.

Any ideas (other than upgrading DS)?

Regards

Nigel

Posted: Wed May 08, 2002 12:34 am
by Starg
Nigel, hello again :-)

We experience exactly the same problem on DS 5.1/Solaris. Tech support have told us that there are problems with Signal handling in Solaris 2.8. Not too sure on this, but it does cause us the occasional problem with jobs hanging.

Tech support suggested turning NOTIFY off, as it may help (it did not help us, but you're welcome to try). If your NOTIFY is OFF, then try turning it ON.

From the UniVerse prompt, type:
-- Start Quote --
CT VOC LOGIN

You will have an entry NOTIFY OFF or NOTIFY ON. Here's what NOTIFY refers to:

Use NOTIFY to specify whether to display messages from phantom
processes instantly or when UniVerse displays the next system
prompt. When you enter UniVerse, NOTIFY OFF is in effect.

ON Messages appear immediately.

OFF Messages appear when UniVerse displays the next system prompt ( > ).

Commands to change NOTIFY ON to OFF:

1) ED VOC LOGIN (opens the file to edit)
2) P (prints all the lines in the file)
3) 3 (to move to line 3, or whatever line number your NOTIFY statement is on)
4) C/ON/OFF (to change ON to OFF)
5) FI (to save changes)

Posted: Wed May 08, 2002 1:12 am
by nigel
Thanks for the update.

We are also using DS on Solaris.
I will implement the NOTIFY change and see what happens.

I found another archive entry on this site regarding the same problem.
It is as follows:

Use an active wait technique in job control to replace the DSWaitForJob routine.
The controlling job runs (sleeping most of the time), periodically checking the status of the job under control.
Example:
Loop
   JobStatus = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
While JobStatus = DSJS.RUNNING Do
   Sleep 60 ; * sleep for a minute
Repeat
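
A polling loop like the one above will wait forever if the child job's status never leaves the running state, which is the failure being discussed here. A minimal sketch of the same idea with an upper bound on the wait might look like the following. It assumes hJob is a handle already obtained from DSAttachJob, that DSJS.RUNNING is the job status constant for a running job, and that DSLogWarn is available in your release. MaxWaitSecs, PollSecs, Waited and the "JobControl" label are hypothetical names chosen for this example, not part of any API.

```
* Sketch only: hJob is assumed to come from an earlier DSAttachJob call.
* MaxWaitSecs, PollSecs and Waited are hypothetical names for this example.
MaxWaitSecs = 3600 ; * give up after one hour
PollSecs = 60 ; * check the child job once a minute
Waited = 0
Loop
   JobStatus = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
While JobStatus = DSJS.RUNNING And Waited < MaxWaitSecs Do
   Sleep PollSecs
   Waited = Waited + PollSecs
Repeat
* If the job still reports running here, the wait timed out.
If JobStatus = DSJS.RUNNING Then
   Call DSLogWarn("Child job still running after ":MaxWaitSecs:" seconds", "JobControl")
End
```

With a cap in place, the control job at least logs a warning and carries on instead of hanging silently. What it should do after a timeout (stop the child, abort, or retry) depends on the job and is left open here.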

Nigel