ds_ipcput() - timeout waiting for mutex

Posted: Wed Dec 05, 2012 5:38 pm
by Ricky Threlfo
Hi, we are on Linux RH 5, running DS Server 8.5, and are getting the following error on a merge job: ds_ipcput() - timeout waiting for mutex

We have changed the project tunables to Interprocess and tried all options of buffer size and timeout. The current buffer size is 512 and the timeout is 60 (we have also tried larger and smaller buffer sizes, along with other timeouts).

We have also set SPINSLEEP to 20,000 and done a uvregen, to no effect.

Can anyone recommend what else we can do to resolve the problem?

I am not a premium member, but would really appreciate your help as this is a production issue.

Posted: Wed Dec 05, 2012 10:20 pm
by chulett
Job design?

Posted: Thu Dec 06, 2012 4:04 pm
by Ricky Threlfo
Hi, thanks for your reply!

The job design is two Oracle sources, each feeding its own transform; both transforms then feed a common merge stage. Finally, the merge stage feeds a transform that writes to a target Oracle stage.

The issue is fixed by populating a hashed file instead of a merge, but I have other merge jobs (that are currently working) that I am worried may exhibit the same behaviour going forward.

Posted: Thu Dec 06, 2012 5:14 pm
by chulett
By "common merge stage" do you actually mean a Link Collector?

Posted: Thu Dec 06, 2012 11:41 pm
by Ricky Threlfo
Yes, it's a Link Collector.

Posted: Sun Jan 06, 2013 3:15 pm
by Ricky Threlfo
Hi Craig, any further updates on this query? I've just come back from holidays and we still seem to be having this issue in our test environment.

Posted: Sun Jan 06, 2013 4:18 pm
by chulett
I tended to avoid stages with a 'timeout' setting for precisely this reason: at some point they will time out. About all you can do is increase the timeout value. If that's not helping, all I can suggest is to contact your official support provider and see what other options you have, perhaps O/S kernel tunables... or a patch.
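
To illustrate the point about timeouts, here is a minimal Python sketch of a writer feeding a bounded buffer with a timeout. It is not how DataStage implements ds_ipcput(); the function name write_rows and its parameters are made up for the example. It only shows the general pattern: when the reading side does not drain the buffer fast enough, the writer's wait eventually expires.

```python
import queue


def write_rows(buffer, rows, timeout_s):
    """Put rows into a bounded buffer.

    Returns the number of rows written before a put timed out.
    (Hypothetical helper, for illustration only.)
    """
    written = 0
    for row in rows:
        try:
            # Blocks while the buffer is full, for at most timeout_s seconds.
            buffer.put(row, timeout=timeout_s)
        except queue.Full:
            # The reader did not free a slot in time: this is the "timeout".
            break
        written += 1
    return written
```

With a buffer of two slots and nothing reading from it, write_rows(queue.Queue(maxsize=2), range(5), 0.05) stops after two rows. A larger buffer or a longer timeout only moves the point at which a stalled reader causes the failure.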

Posted: Sun Jan 06, 2013 6:41 pm
by ray.wurlod
What Craig said. Don't worry about the "mutex" part of the message; this is simply how semaphores are implemented on your system (MUTually EXclusive locks).

further clarification

Posted: Mon Jan 07, 2013 2:27 pm
by Ricky Threlfo
Hi Craig and Ray,

I'm not quite clear on what you mean by timeout at the "stage" level. I don't adjust the timeout at the stage level, only at the job or project level. Can you kindly clarify? Increasing the timeout does help, sometimes.

Are there also settings I can tune at the back end of datastage to avoid this problem going forward?

Rgds,

Rick

Posted: Mon Jan 07, 2013 3:45 pm
by ray.wurlod
Have you tried increasing SPINTRIES (maybe doubling it)?

SPINTRIES

Posted: Mon Jan 07, 2013 9:16 pm
by Ricky Threlfo
Hi, Ray

It's currently set to zero. I can't see an entry for Spinlocks (as below) - what's your recommendation? Should I set it to 250?

************

# SPINTRIES - On an HP multi-processor system this
# value determines whether spin locking is used
# instead of regular semaphore locking. This value
# determines the number of attempts to obtain the
# spinlock before the process sleeps. If spinlocks
# are to be used then choose a value between 5 and
# 500. A value of 0 switches OFF spinlocks. Misuse of
# this tunable can drastically affect system throughput.
SPINTRIES 0
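
For checking what a tunable is currently set to, a small Python sketch along these lines could be used. It assumes only what the excerpt above shows: comment lines start with "#" and each setting is a "NAME value" pair on its own line. The function name read_tunables is hypothetical, and the text to parse has to be supplied by the caller, since the file location is not given in this thread.

```python
def read_tunables(text):
    """Parse 'NAME value' lines, skipping blank lines and '#' comments.

    Assumes the layout shown in the excerpt above; values are kept as strings.
    """
    tunables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            tunables[parts[0]] = parts[1].strip()
    return tunables
```

Calling read_tunables on the contents of the file and looking up "SPINTRIES" returns the value as a string, which makes it easy to compare settings between environments.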

SPINTRIES AND SPINLOCK combination

Posted: Mon Jan 07, 2013 9:23 pm
by Ricky Threlfo
Looks like if I put an entry for SPINTRIES (say 250) I will have to put an entry for SPINLOCKS in the uvconfig as well. What, in your opinion, is a sensible combination of these 2 variables?

Posted: Mon Jan 07, 2013 9:55 pm
by ray.wurlod
There is no entry for spinlocks. Spinlocks are a different way to wait for events: rather than just waiting placidly on a semaphore, a spinlock is a "busy wait" that keeps retrying immediately. Doing so with many locks can chew up large quantities of CPU.
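
The difference between the two kinds of wait can be sketched in Python. This is a generic illustration, not the DataStage engine's locking code; the function names and the parameters spin_tries, sleep_s and max_rounds are invented for the example and only loosely echo the idea of a try count followed by a sleep.

```python
import threading
import time


def acquire_blocking(lock, timeout_s):
    """Placid wait: block inside the lock until it is free or timeout_s expires."""
    return lock.acquire(timeout=timeout_s)


def acquire_spinning(lock, spin_tries, sleep_s, max_rounds):
    """Busy wait: retry immediately spin_tries times, then sleep and try again.

    Gives up and returns False after max_rounds rounds of spinning.
    """
    for _ in range(max_rounds):
        for _ in range(spin_tries):
            # Non-blocking attempt: returns at once whether or not it succeeded.
            if lock.acquire(blocking=False):
                return True
        # All immediate retries failed, so back off before spinning again.
        time.sleep(sleep_s)
    return False
```

The blocking version uses almost no CPU while it waits. The spinning version burns CPU on every failed attempt, which is why a high try count spread across many locks can hurt throughput.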

By all means take alternative advice on this. Spinlocks are mainly (only?) found in HP-UX, so you probably should not use them on Linux anyway.

Posted: Tue Jan 08, 2013 7:33 am
by chulett
As noted, SPINTRIES are for HP systems and HP-UX. I'd check with your support provider before turning it on for anything else. Also note you cannot add your own entries to uvconfig.

SPINTRIES

Posted: Tue Jan 08, 2013 2:43 pm
by Ricky Threlfo
Hi Craig,

Given we are not on HP, should I not increase SPINTRIES now? Or should I go ahead anyway and set SPINTRIES to 250?

This "timeout waiting for mutex" problem seems to be occurring sporadically, and I have no means to prevent it other than removing the Link Collector stage and replacing it with a hashed file or some equivalent.