Jobs intermittently hanging when starting


Moderators: chulett, rschirm, roy

johnmwilliams
Premium Member
Posts: 5
Joined: Sun Jul 31, 2005 10:43 pm


Post by johnmwilliams »

Hi,

I'm new to DS. This is my first post/question to the list, but I've been lurking for a little while and have already picked up some useful tips (thanx).

I'm working in DS 4 on Unix. I have some jobs which in turn invoke other jobs using DSRunJob. The invoked jobs generally work as expected, but every so often will start up and stay in the 'starting' phase, according to the monitor, and not proceed any further. After aborting and resetting the job, on re-run everything works, unless it hangs on another job.

I've enabled tracing and have identified the point at which it hangs:

COMO DSRTRACE_dstage-21489 established 17:04:49 11 AUG 2005
2005-08-11 17:04:49: Initialised on 17:04:49 11 AUG 2005
2005-08-11 17:04:49: DSR_MESSAGE =DataStage Job 38 Phantom 21490
DataStage Job 38 Phantom 21490

When it hangs it never goes past this point, and the Unix process continues ad infinitum as well. A successful run would continue with:

xxx.yyy.com.au
2005-08-11 17:04:52: DSR_MESSAGE =DSGetProjectInfo called: InfoType = 2
2005-08-11 17:04:52: DSR_MESSAGE =DSGetProjectInfo: Returned Result = test
/hta/data/data76/ds_projects/test
/hta/data/data76/ds_projects/test
xxx.yyy.com.au
2005-08-11 17:04:52: DSR_MESSAGE =DSGetProjectInfo called: InfoType = 2
2005-08-11 17:04:52: DSR_MESSAGE =DSGetProjectInfo: Returned Result = test
2005-08-11 17:04:56: DSR_MESSAGE =5 rows read from SERVICE
2005-08-11 17:04:56: DSR_MESSAGE =5 rows written to SERVICE
2005-08-11 17:04:56: DSR_MESSAGE =DataStage Phantom Finished.
DataStage Phantom Finished.

It looks like the call to DSGetProjectInfo either never returns or never runs in the first place.

Any suggestions would be greatly appreciated. The installation has been running successfully for quite a while, but this is a new piece of work altogether.

Many thanks,
John Williams
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Welcome aboard! :D

In the job control routine, immediately after DSAttachJob, add another innocuous function call that uses the job handle. For example:

RealName = DSGetJobInfo(hJob, DSJ.JOBNAME)

Replace the DSWaitForJob call with a "busy wait": a loop that periodically checks the job's status, sleeps for a while, and exits once the job has finished. Also code an additional way to command the loop to exit, for example by executing a UNIX command to detect the presence or absence of a file.

Within that loop, log messages as to the status of the job. If it's still "starting" after - say - 30 seconds, obtain every bit of information that you can imagine and write it to the log for analysis.
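
A minimal sketch of such a busy wait might look like the following. The job name, stop-file path, timing values and the "BusyWait" log label are hypothetical, and the stop-file check uses OpenSeq rather than a UNIX command purely to keep the sketch short. It also assumes that the status constants and DSGetJobInfo info types used here are available in your release, and that the status read after DSRunJob returns reflects the new run; verify both before relying on it.

```basic
* Sketch only. ChildJobName, StopFile and the timing values are
* hypothetical and must be adapted to the real job control routine.
      ChildJobName = "MyChildJob"
      StopFile = "/tmp/stop_waiting"
      PollSecs = 5
      WarnSecs = 30
      MaxSecs = 3600

      hJob = DSAttachJob(ChildJobName, DSJ.ERRFATAL)
* Innocuous call that exercises the job handle straight after attaching
      RealName = DSGetJobInfo(hJob, DSJ.JOBNAME)
      ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)

      Elapsed = 0
      Warned = @FALSE
      Loop
         JobStatus = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
* Assumption: these three statuses are enough to mean "finished" here
         Finished = (JobStatus = DSJS.RUNOK Or JobStatus = DSJS.RUNWARN Or JobStatus = DSJS.RUNFAILED)
* Manual exit: creating the stop file commands the loop to give up
         StopRequested = @FALSE
         OpenSeq StopFile To StopFv Then
            StopRequested = @TRUE
            CloseSeq StopFv
         End Else
            Null
         End
      Until Finished Or StopRequested Or Elapsed >= MaxSecs Do
         Call DSLogInfo(RealName : " status " : JobStatus : " after " : Elapsed : " seconds", "BusyWait")
* Not reported as running after WarnSecs: dump extra detail, once
         If Elapsed >= WarnSecs And JobStatus <> DSJS.RUNNING And Not(Warned) Then
            Detail = "Start time: " : DSGetJobInfo(hJob, DSJ.JOBSTARTTIMESTAMP)
            Detail = Detail : ", wave no: " : DSGetJobInfo(hJob, DSJ.JOBWAVENO)
            Detail = Detail : ", controller: " : DSGetJobInfo(hJob, DSJ.JOBCONTROLLER)
            Call DSLogWarn(RealName : " not running yet. " : Detail, "BusyWait")
            Warned = @TRUE
         End
         Sleep PollSecs
         Elapsed = Elapsed + PollSecs
      Repeat
      ErrCode = DSDetachJob(hJob)
```

The MaxSecs ceiling is an extra safety net not mentioned above; it stops the controlling job from waiting forever if the child ends in a status the loop does not treat as finished.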
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
johnmwilliams
Premium Member
Posts: 5
Joined: Sun Jul 31, 2005 10:43 pm


Post by johnmwilliams »

Hi Ray,

Thanks for the ideas. I've not made a lot of progress towards identifying the underlying cause, but your debugging hints are helping me find my way round it.

In the meantime the problem appeared to get worse, so we restarted the server. This improved matters vastly, but we have still experienced the problem once since the restart.

Cheers,
John Williams
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Re: Jobs intermittently hanging when starting

Post by ogmios »

John,

Maybe check with Ascential... we had the same problem when we were on v4.x. I believe they had us put a shared library (lpw or so) in front of the others in the dsenv file, which made a huge difference. We did spend a couple of months on it.

Unfortunately I no longer recall exactly what it was, and we have since moved on to v6 and v7... :wink: I will check tomorrow whether I can still find something.

Ogmios
In theory there's no difference between theory and practice. In practice there is.