Mutex Timeouts

Posted: Mon Dec 04, 2006 4:48 pm
by jpr196
Hi,

We have delivered jobs with Error Tables that abort the job due to Mutex Timeouts. A few of the jobs in their simplest form look like this:

src --> ipc ---> trans ---> ipc ---> tgt ---> hash

The error table comes off the transformer without an ipc and there are some hash lookups attached to the transformer as well.

When we take the error validation off, the job runs fine. I was also able to fix one of the jobs by adding an ipc between the transformer and the error table.
However, we don't believe that customizing all the delivered jobs with an extra ipc is the best solution for this problem.

Has anyone seen this problem before and found a solution? We suspect that not enough memory is being allocated on the server or the database.

Posted: Mon Dec 04, 2006 4:59 pm
by ray.wurlod
Maybe your computer is too fast! It could be exhausting the number of retries before the timeout kicks in.

Search the Forum for information about the SPINTRIES and SPINSLEEP configuration variables that can be used to tune mutex lock behaviour. You might also increase the timeout on your IPC stages.

Posted: Tue Dec 05, 2006 7:28 am
by Nagin
Ray,
I got this error before. When I recompiled the job and ran it again a couple of hours later, it seemed to work. But I would really like to tune the mutex lock behaviour, so I searched for SPINTRIES and SPINSLEEP as you suggested, and the search did not return any results at all. Any other recommendation on how to find this?

Thanks.

Posted: Tue Dec 05, 2006 7:33 am
by Nagin
My bad, I found it using the exact word search.

Posted: Tue Dec 05, 2006 8:41 am
by jpr196
Hi Ray,

Thanks for the suggestion. I did a search on the forum and this may very well be our problem. Our settings for SPINTRIES and SPINSLEEP are 0 and 5000 respectively. Would this mean SPINTRIES is turned off and shouldn't affect us, or should we try increasing it from 0?
Thanks for the help.

Posted: Tue Dec 05, 2006 6:52 pm
by ray.wurlod
No, 0 means unlimited. Slightly increasing SPINSLEEP (maybe to 6000) will mean that the wait between retries is extended, so that the likelihood of hitting the timeout is reduced.

Changes do not take effect until uvregen has been executed and services restarted.
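
As a rough illustration of the retry-and-sleep idea discussed in this thread, here is a toy model in Python. It is not the engine's actual locking code; the function name, the SimulatedLock class and the abstract "ticks" unit are all made up for the example. The only thing borrowed from the thread is the convention that a retry count of 0 means unlimited. It shows why a longer sleep between retries lets a waiter with a fixed number of tries hold out longer before giving up.

```python
from typing import Callable


def acquire_with_retries(try_acquire: Callable[[], bool],
                         sleep: Callable[[int], None],
                         max_tries: int,
                         sleep_ticks: int) -> bool:
    """Poll try_acquire, sleeping between failed attempts.

    max_tries == 0 means "retry without limit" (the convention
    described in this thread). Hypothetical helper, not a real API.
    """
    attempts = 0
    while True:
        if try_acquire():
            return True
        attempts += 1
        if max_tries and attempts >= max_tries:
            return False  # retries exhausted before the lock was freed
        sleep(sleep_ticks)


class SimulatedLock:
    """Toy lock held by someone else until simulated time `free_at`."""

    def __init__(self, free_at: int) -> None:
        self.now = 0          # simulated clock, in abstract ticks
        self.free_at = free_at

    def try_acquire(self) -> bool:
        return self.now >= self.free_at

    def sleep(self, ticks: int) -> None:
        self.now += ticks     # advance the simulated clock, no real waiting


if __name__ == "__main__":
    for ticks in (5000, 12000):
        lock = SimulatedLock(free_at=20000)
        ok = acquire_with_retries(lock.try_acquire, lock.sleep,
                                  max_tries=3, sleep_ticks=ticks)
        print(f"3 tries, sleep {ticks}: acquired={ok}")
```

In this model, three tries with a short sleep give up while the lock is still held, while the same three tries with a longer sleep are still waiting when it is released. Whether the real settings behave this way on your server is something to confirm against the product documentation.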