Time out warnings

GY768
Participant
Posts: 14
Joined: Fri Oct 28, 2005 8:12 am

Time out warnings

Post by GY768 »

Can anyone help with the following:

I have been executing a run sequence. Jobs within the sequence that contain pivot tables are producing many files (over a million), and DataStage produces the error below. I have investigated every area I can think of and have failed to come up with a solution.

It seems the timeout occurs because the jobs are in the aborted state; however, the action set within the sequence is to reset the job if required.

Can anyone help?

run_DWH_ELEMENT..JobControl (@Seq_Xfm_Lnd_Pld_DWH_ELEMENT_QUOTE):
Controller problem: Error calling
DSRunJob(Seq_Xfm_Lnd_Pld_DWH_ELEMENT_QUOTE), code=-14 [Timed out while waiting for an event]
Philip Morris
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

The node that holds the DSEngine is probably slammed; your job control is timing out in its request to run a job.

Think of it this way: the API sends a message to the engine process asking it to start the job, the engine doesn't acknowledge the message, and the API times out. The only reason the engine process didn't respond is that the node is overwhelmed.

Your message comes from the DSRunJob API, and code -14 is its timeout error. If you're on Solaris, run prstat from a telnet session and monitor the node load. If you're on AIX or HP-UX, get your hands on top or glance to see the same load measures. You'll need to talk to Ascential tech support about any patches that might mitigate this issue.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
track_star
Participant
Posts: 60
Joined: Sat Jan 24, 2004 12:52 pm
Location: Mount Carmel, IL

Post by track_star »

Are you running multiple-instance jobs? Also, how many jobs are running concurrently when you get the error? There are a few settings in the uvconfig file that might alleviate the issue.
GY768
Participant
Posts: 14
Joined: Fri Oct 28, 2005 8:12 am

Multiple Instance

Post by GY768 »

Yes. At any one time there are, on average, between 12 and 15 multiple-instance jobs running in the sequence.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

The multiple-instance jobs aren't the issue; the load on the node is. I can have hundreds of very resource-light jobs running simultaneously without issue, but one intensive job can use all the resources.

You need to monitor server load. Every single DS developer needs to get into the habit of having top, prstat, glance, or whatever tool is available running all of the time. It's even better to have a single tool gathering the information continuously, 24x7x365, and making it available to everyone.
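As a rough illustration of the continuous gathering idea, here is a minimal sketch that appends a timestamped `uptime` line (which includes the load averages on most UNIX systems) to a log file. The function name `log_load_sample`, the `LOAD_LOG` variable, its default path, and the 60-second interval are all assumptions made for this example, not DataStage features; a real site would more likely use sar or a proper monitoring tool.

```shell
#!/bin/sh
# Hypothetical helper: append one timestamped load sample to a log file.
# LOAD_LOG and its default location are illustrative assumptions.
LOAD_LOG="${LOAD_LOG:-/tmp/ds_node_load.log}"

log_load_sample() {
    # uptime reports the 1, 5 and 15 minute load averages on most UNIX systems
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(uptime)" >> "$LOAD_LOG"
}

# Example use (left commented out so the script does not loop forever):
# while true; do log_load_sample; sleep 60; done
```

Comparing the timestamps in this log with the time of the code=-14 error should show whether the node was busy when the request to start the job timed out.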
track_star
Participant
Posts: 60
Joined: Sat Jan 24, 2004 12:52 pm
Location: Mount Carmel, IL

Post by track_star »

GY768, can you post the values from the following entries in your uvconfig file:

RLTABSZ
GLTABSZ
MAXRLOCK
UVSYNC

The file is in DSEngine.
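One way to pull just those entries out of the file is sketched below. It assumes `$DSHOME` points at the DSEngine directory and that each parameter sits at the start of a line followed by whitespace and its value; the function name `show_uvconfig_params` is made up for this example.

```shell
#!/bin/sh
# Sketch: print selected tunables from uvconfig.
# Assumption: entries look like "NAME value" at the start of a line.
show_uvconfig_params() {
    # $1 = path to uvconfig (defaults to $DSHOME/uvconfig)
    grep -E '^(RLTABSZ|GLTABSZ|MAXRLOCK|UVSYNC)[[:space:]]' "${1:-$DSHOME/uvconfig}"
}

# Example use:
# show_uvconfig_params
# show_uvconfig_params /path/to/a/copy/of/uvconfig
```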
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Not sure where you're going with that, but without GSEMNUM they're not particularly useful. Surely if collisions on locks in the Repository were the problem, a report from SEMAPHORE.STATUS would be a useful thing? It might also be worth creating $DSHOME/errlog to capture any Engine errors.

touch $DSHOME/errlog
chmod 777 $DSHOME/errlog
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bkarth
Premium Member
Posts: 9
Joined: Wed Oct 26, 2005 2:04 pm

We are getting the same problem on a Win 2003 Server

Post by bkarth »

Hello,

We are getting the same error on a Win 2003 Server.

RLTABSZ = 75
GLTABSZ = 75
MAXRLOCK = 74
UVSYNC = 0

Is there any patch for this? All these jobs are very simple ones and shouldn't take up many resources. I am not sure what is causing this issue.

DS Version is 7.5x2 (Server Job Sequence)

Thanks,
Karthik
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

What else is happening on the server? I recall one site that ran DS on their Primary Domain Controller (!) - this error occurred many times!