Phantom / timeout waiting for mutex error

Posted: Wed Apr 22, 2009 12:27 am
by vikibemech
Hi,

I am working on a maintenance and support project. DataStage jobs often fail with Phantom and mutex errors. A few jobs are unable to start at the sequence level and show no error messages in the logs, yet the sequence aborts. Please advise.

Re: Phantom error

Posted: Wed Apr 22, 2009 12:50 am
by sachin1
Please provide more details on the error messages you receive.

Posted: Wed Apr 22, 2009 2:28 am
by ray.wurlod
Ignore "phantom", that's just another word for a background process. Your errors are on mutex locks, probably timing out while waiting. You should find solutions by searching DSXchange.

Re: Phantom error

Posted: Wed Apr 22, 2009 5:09 am
by sharantheboss
Hi,

The mutex error mainly occurs when a Link Collector stage waits for data to arrive and finally fails. Increase the timeout value in the job properties and re-run the job.

Regards
Boss


Posted: Wed Apr 22, 2009 6:36 am
by sjfearnside
Here is information from my support provider:

"Your issue "timeout waiting for mutex" is most often caused by a timeout for your project or job that is set too low.
If you see this error, the suggested setting to start with is 300.

To set this at the Project level: From the DataStage Administrator, select Project -> Properties -> Tunables and set the timeout value to 300.

To set this at the Job level: From the DataStage Designer for the job, select Properties -> Performance and set the timeout value to 300.

Another thing to look for is database errors. In the case that you cite, you had already adjusted several parameters and you showed one of the many DB errors that occurred.

If the above adjustments to the timeout value do not help, it may be useful to begin the job and look for the first DBMS error that occurs. Sometimes these can cascade and cause errors like the mutex error."
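
To illustrate why a low timeout produces this kind of error, here is a minimal Python sketch of a timed wait on a lock. It is a generic illustration only, not how the DataStage engine is implemented; the names `wait_for_mutex` and `run_demo` and the durations are hypothetical.

```python
import threading
import time


def wait_for_mutex(lock, timeout_s):
    """Try to take the lock, giving up after timeout_s seconds."""
    if not lock.acquire(timeout=timeout_s):
        raise TimeoutError(f"timeout waiting for mutex after {timeout_s}s")
    lock.release()
    return True


def run_demo(hold_s, timeout_s):
    """Return True if the waiter got the lock, False if it timed out.

    A background thread holds the lock for hold_s seconds, standing in
    for a slow upstream process (an assumption made for illustration).
    """
    lock = threading.Lock()
    held = threading.Event()

    def holder():
        with lock:
            held.set()          # signal that the lock is now taken
            time.sleep(hold_s)  # simulate slow work while holding it

    worker = threading.Thread(target=holder)
    worker.start()
    held.wait()                 # make sure the holder owns the lock first
    try:
        return wait_for_mutex(lock, timeout_s)
    except TimeoutError:
        return False
    finally:
        worker.join()


if __name__ == "__main__":
    print(run_demo(hold_s=0.3, timeout_s=0.05))  # timeout shorter than the hold
    print(run_demo(hold_s=0.1, timeout_s=2.0))   # timeout longer than the hold
```

When the timeout is shorter than the time the other side needs, the waiter gives up even though nothing is actually broken; a larger timeout lets the same workload finish.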

Posted: Wed Apr 22, 2009 2:24 pm
by ray.wurlod
Search for SPINTRIES and SPINSLEEP also. Your processor may be too fast for its own good.

Posted: Fri May 08, 2009 12:11 pm
by asorrell
ray.wurlod wrote: Search for SPINTRIES and SPINSLEEP also. Your processor may be too fast for its own good.
Ray - Just FYI: I pursued this with IBM to get recommended settings for these tunables. I even pushed it to the point of forcing engineering to reply, and this is what I got from them:
"The InformationServer engine uses regular semaphore locking by default (i.e. SPINTRIES 0). Spinlocks can be used if the customer decides to enable the SPINTRIES tunable in uvconfig, but I hesitate to suggest that as a resolution for the "Timeout waiting for mutex" as we've seen this error at customer sites and the resolution has never been via a uvconfig tunable."
They recommended only larger IPC buffers with increased timeout values, even though we were on an HP system.
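
For readers unfamiliar with the distinction engineering draws between spinlocks and regular semaphore locking, the following Python sketch shows the general idea of "spin a few times, then fall back to a blocking wait". The parameter names `spin_tries` and `spin_sleep_s` merely echo SPINTRIES and SPINSLEEP; the thread does not describe how the engine actually interprets those tunables, so treat this as an assumption-laden illustration, not as the engine's algorithm.

```python
import threading
import time


def acquire_with_spin(lock, spin_tries, spin_sleep_s, timeout_s):
    """Acquire lock by spinning first, then blocking.

    Returns "spin" if the lock was taken during the spin phase,
    "blocking" if it was taken by the blocking wait, or None on timeout.
    With spin_tries=0 the spin phase is skipped entirely, which mirrors
    the "regular locking by default" case described in the quote above.
    """
    for _ in range(spin_tries):
        if lock.acquire(blocking=False):  # non-blocking attempt
            return "spin"
        time.sleep(spin_sleep_s)          # brief pause between attempts
    if lock.acquire(timeout=timeout_s):   # ordinary blocking wait
        return "blocking"
    return None


if __name__ == "__main__":
    lock = threading.Lock()
    print(acquire_with_spin(lock, spin_tries=0, spin_sleep_s=0.001, timeout_s=0.1))
    lock.release()
    print(acquire_with_spin(lock, spin_tries=3, spin_sleep_s=0.001, timeout_s=0.1))
```

Either way, a lock that stays held longer than the final timeout still ends in a timeout, which is consistent with the advice to look at timeout values first.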

Posted: Mon May 25, 2009 11:39 pm
by ret
ray.wurlod wrote: Search for SPINTRIES and SPINSLEEP also. Your processor may be too fast for its own good. ...
The uvconfig file describes these parameters as HP-specific. From what I've found on this problem (which we are experiencing on Solaris 10), it's not clear whether they apply to other platforms.

Having said that, the Project/Tunables timeout setting was the default value of 10, so we are trying the recommendation above of increasing that to 300.

Can anyone shed any light on whether these parameters are relevant on Solaris?

cheers
RET