ds_ipcput() - timeout waiting for mutex


Ricky Threlfo
Premium Member
Posts: 26
Joined: Mon Aug 13, 2007 8:06 pm
Location: Australia

Post by Ricky Threlfo »

Hi, we are on Red Hat Linux 5, running DS Server 8.5, and are getting the following error on a merge job: ds_ipcput() - timeout waiting for mutex.

We have set the project tunables to use inter-process buffering and tried every combination of buffer size and timeout. The current buffer size is 512 and the timeout is 60 (we have also tried larger and smaller buffer sizes and timeouts).

We have also set SPINSLEEP to 20,000, to no effect. (I ran uvregen as well, also to no effect.)

Can anyone recommend what else we can do to resolve the problem?

I am not a premium member, but would really appreciate your help, as this is a production issue.
Rick
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Job design?
-craig

"You can never have too many knives" -- Logan Nine Fingers

Post by Ricky Threlfo »

Hi, thanks for your reply!

The job design is two Oracle sources, each feeding its own transform; both transforms then feed a common merge stage. Finally, the merge stage feeds a transform that writes to a target Oracle stage.

The issue is fixed by populating a hashed file instead of a merge, but I have other merge jobs (that are currently working) that I am worried may exhibit the same behaviour going forward.
Rick

Post by chulett »

By "common merge stage" do you actually mean a Link Collector?
-craig

Post by Ricky Threlfo »

Yes, it's a Link Collector.
Rick

Post by Ricky Threlfo »

Hi Craig, any further updates on this query? I've just come back from holidays and we still seem to be having this issue in our test environment.
Rick

Post by chulett »

I tended to avoid stages with a 'timeout' setting for precisely this reason: at some point they will time out. About all you can do is increase the timeout value. If that's not helping, all I can suggest is to contact your official support provider and see what other options you have, perhaps O/S kernel tunables... or a patch.
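
To make the buffer size and timeout trade-off concrete, here is a rough analogy in Python. It is not how the DataStage engine implements inter-process buffering; `run_demo` and its parameters are hypothetical names made up for this sketch. A bounded queue stands in for the buffer between two stages, and the upstream side gives up if no slot frees within the timeout.

```python
import queue
import threading
import time


def run_demo(buffer_rows, put_timeout, consumer_delay, rows=5):
    """Return "ok" if every row was handed over, else "timeout"."""
    # The bounded queue stands in for the buffer between two stages.
    buf = queue.Queue(maxsize=buffer_rows)
    stop = threading.Event()

    def consumer():
        # Downstream side: take one row, then stay busy for consumer_delay.
        while not stop.is_set():
            try:
                buf.get(timeout=0.05)
            except queue.Empty:
                continue
            time.sleep(consumer_delay)

    worker = threading.Thread(target=consumer, daemon=True)
    worker.start()
    result = "ok"
    try:
        for row in range(rows):
            # Upstream side: give up if no slot frees within put_timeout.
            buf.put(row, timeout=put_timeout)
    except queue.Full:
        result = "timeout"
    finally:
        stop.set()
        worker.join()
    return result


if __name__ == "__main__":
    print(run_demo(buffer_rows=1, put_timeout=0.05, consumer_delay=0.5))
    print(run_demo(buffer_rows=1, put_timeout=2.0, consumer_delay=0.01))
```

In this sketch the first call reports a timeout because the downstream side stays busy far longer than the writer is willing to wait, while the second call completes. A longer timeout only helps when the downstream side eventually catches up, which would be consistent with the "helps, sometimes" behaviour described in this thread.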
-craig
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

What Craig said. Don't worry about the "mutex" part of the message; this is simply how semaphores are implemented on your system (MUTually EXclusive locks).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

further clarification

Post by Ricky Threlfo »

Hi Craig and Ray,

I'm not quite clear on what you mean by a timeout at the "stage" level. I don't adjust the timeout at the stage level, only at the job or project level. Can you kindly clarify? Increasing the timeout does help, sometimes.

Are there also settings I can tune at the back end of datastage to avoid this problem going forward?

Rgds,

Rick

Post by ray.wurlod »

Have you tried increasing SPINTRIES (maybe doubling it)?

SPINTRIES

Post by Ricky Threlfo »

Hi, Ray

It's currently set to zero. I can't see an entry for Spinlocks (as below) - what's your recommendation? Should I set to 250?

************

# SPINTRIES - On an HP multi-processor system this
# value determines whether spin locking is used
# instead of regular semaphore locking. This value
# determines the number of attempts to obtain the
# spinlock before the process sleeps. If spinlocks
# are to be used then choose a value between 5 and
# 500. A value of 0 switches OFF spinlocks. Misuse of
# this tunable can drastically affect system throughput.
SPINTRIES 0
Rick

SPINTRIES AND SPINLOCK combination

Post by Ricky Threlfo »

Looks like if I put an entry for SPINTRIES (say 250) I will have to put an entry for SPINLOCKS in the uvconfig as well. What, in your opinion, is a sensible combination of these 2 variables?
Rick

Post by ray.wurlod »

There is no entry for spinlocks. Spinlocks are a different way to wait for events: rather than waiting placidly on a semaphore, a spinlock is a "busy wait" that keeps retrying immediately. Doing so with many locks can chew up a lot of CPU.

By all means take alternative advice on this. Spinlocks are mainly (only?) found in HP-UX, so you probably should not use them on Linux anyway.
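
As a rough illustration of the difference between spinning and sleeping, the Python sketch below retries a lock a fixed number of times without pausing and only then falls back to sleeping between attempts. It is an analogy, not the engine's actual locking code; `acquire_with_spin`, `spin_tries`, `sleep_seconds` and `max_sleeps` are hypothetical names, with `spin_tries` loosely mirroring the idea behind SPINTRIES.

```python
import threading
import time


def acquire_with_spin(lock, spin_tries, sleep_seconds=0.01, max_sleeps=100):
    """Spin up to spin_tries times, then fall back to sleep-and-retry.

    Returns (acquired, spins_used, sleeps_used).
    """
    spins = 0
    for _ in range(spin_tries):
        # Busy wait: retry immediately, burning CPU while the lock is held.
        spins += 1
        if lock.acquire(blocking=False):
            return True, spins, 0

    sleeps = 0
    while sleeps < max_sleeps:
        # Placid wait: give up the CPU between attempts.
        if lock.acquire(blocking=False):
            return True, spins, sleeps
        time.sleep(sleep_seconds)
        sleeps += 1
    return False, spins, sleeps


if __name__ == "__main__":
    free_lock = threading.Lock()
    print(acquire_with_spin(free_lock, spin_tries=250))

    held_lock = threading.Lock()
    held_lock.acquire()  # simulate another process holding the lock
    print(acquire_with_spin(held_lock, spin_tries=5,
                            sleep_seconds=0.001, max_sleeps=3))
```

With `spin_tries=0` the function skips the busy wait entirely, which corresponds to the "0 switches OFF spinlocks" wording in the uvconfig comment quoted earlier. Whether spinning helps at all depends on how briefly locks are held, so treat the numbers here as placeholders rather than tuning advice.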

Post by chulett »

As noted, SPINTRIES are for HP systems and HP-UX. I'd check with your support provider before turning it on for anything else. Also note you cannot add your own entries to uvconfig.
-craig

SPINTRIES

Post by Ricky Threlfo »

Hi Craig,

Given we are not on HP, should I leave SPINTRIES alone for now? Or should I go ahead anyway and set SPINTRIES to 250?

This "timeout waiting for mutex" problem seems to be occurring sporadically, and I have no means to prevent it other than removing the Link Collector stage and replacing it with a hashed file or some equivalent.
Rick