ds_ipcput() - timeout waiting for mutex
-
- Premium Member
- Posts: 26
- Joined: Mon Aug 13, 2007 8:06 pm
- Location: Australia
ds_ipcput() - timeout waiting for mutex
Hi, we are on Linux RH 5, running DS Server 8.5, and are getting the following error on a merge job: ds_ipcput() - timeout waiting for mutex.
We have changed the project tunables to Interprocess and tried all options of buffer size and timeout. The current buffer size is 512 and the timeout is 60 (we have also tried larger and smaller buffer sizes, along with other timeouts).
We have also set the SPINSLEEP to 20,000 - to no effect. (I have done a uvregen as well, to no effect)
Can anyone recommend what else we can do to resolve the problem?
I am not a premium member, but would really appreciate your help as this is a production issue.
Rick
-
- Premium Member
- Posts: 26
- Joined: Mon Aug 13, 2007 8:06 pm
- Location: Australia
Hi, thanks for your reply!
The job design is two Oracle sources, each feeding a transform; both transforms then feed a common merge stage. Finally, the merge stage feeds a transform that writes to a target Oracle stage.
The issue is fixed by populating a hashed file instead of a merge, but I have other merge jobs (that are currently working) that I am worried may exhibit the same behaviour going forward.
Rick
-
- Premium Member
- Posts: 26
- Joined: Mon Aug 13, 2007 8:06 pm
- Location: Australia
I tended to avoid stages with a 'timeout' setting for precisely this reason: at some point they will time out. About all you can do is increase the timeout value. If that's not helping, all I could suggest after that is to contact your official support provider and see what other options you have, perhaps O/S kernel tunables... or a patch.
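To illustrate why a buffer with a timeout will eventually time out when the reader stalls, here is a minimal Python sketch. It is only an analogy and not how DataStage implements inter-process row buffering; the function name `run_pipeline` and all of its parameters are hypothetical.

```python
import queue
import threading
import time


def run_pipeline(buffer_rows, put_timeout, consumer_delay, total_rows):
    """Push rows through a bounded buffer.

    Returns True if every row was handed over, or False if the writer
    gave up because the buffer stayed full longer than put_timeout.
    """
    buf = queue.Queue(maxsize=buffer_rows)
    stop = threading.Event()

    def consumer():
        # Reader side: take a row, then spend consumer_delay "processing" it.
        while not stop.is_set():
            try:
                buf.get(timeout=0.05)
            except queue.Empty:
                continue
            time.sleep(consumer_delay)

    reader = threading.Thread(target=consumer, daemon=True)
    reader.start()
    try:
        for row in range(total_rows):
            # Blocks while the buffer is full; raises queue.Full on timeout.
            buf.put(row, timeout=put_timeout)
        return True
    except queue.Full:
        return False
    finally:
        stop.set()
        reader.join()


if __name__ == "__main__":
    # A slow reader with a short writer timeout: the writer gives up.
    print(run_pipeline(buffer_rows=2, put_timeout=0.05,
                       consumer_delay=0.5, total_rows=10))
    # A fast reader: every row goes through.
    print(run_pipeline(buffer_rows=2, put_timeout=1.0,
                       consumer_delay=0.0, total_rows=20))
```

In this sketch a larger buffer or a longer timeout only delays the failure when the reader is slower than the writer for long enough, which is consistent with increasing the timeout helping only sometimes.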
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Premium Member
- Posts: 26
- Joined: Mon Aug 13, 2007 8:06 pm
- Location: Australia
further clarification
Hi Craig and Ray,
I'm not quite clear on what you mean by timeout at the "stage" level. I don't adjust the timeout at the stage level, only at the job or project level. Can you kindly clarify? Increasing the timeout does help, sometimes.
Are there also settings I can tune at the back end of DataStage to avoid this problem going forward?
Rgds,
Rick
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Premium Member
- Posts: 26
- Joined: Mon Aug 13, 2007 8:06 pm
- Location: Australia
SPINTRIES
Hi, Ray
It's currently set to zero. I can't see an entry for Spinlocks (see the excerpt below). What's your recommendation? Should I set it to 250?
************
# SPINTRIES - On an HP multi-processor system this
# value determines whether spin locking is used
# instead of regular semaphore locking. This value
# determines the number of attempts to obtain the
# spinlock before the process sleeps. If spinlocks
# are to be used then choose a value between 5 and
# 500. A value of 0 switches OFF spinlocks. Misuse of
# this tunable can drastically affect system throughput.
SPINTRIES 0
Rick
-
- Premium Member
- Posts: 26
- Joined: Mon Aug 13, 2007 8:06 pm
- Location: Australia
SPINTRIES AND SPINLOCK combination
It looks like if I put in an entry for SPINTRIES (say 250), I will have to put an entry for SPINLOCKS in the uvconfig as well. What, in your opinion, is a sensible combination of these two variables?
Rick
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
There is no entry for spinlocks. Spinlocks are a different way to wait for events: rather than just waiting placidly on a semaphore, a spinlock is a "busy wait" that keeps retrying immediately. Doing so with many locks can chew up large quantities of CPU.
By all means take alternative advice on this. Spinlocks are mainly (only?) found in HP-UX, so you probably should not use them on Linux anyway.
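The difference between a busy wait and a semaphore-style wait can be sketched in Python as follows. This is an illustration under stated assumptions, not the engine's locking code: the helper names `spin_acquire` and `acquire_with_spin` are hypothetical, and `tries` only loosely mirrors what the uvconfig comment describes for SPINTRIES (attempts before the process sleeps).

```python
import threading


def spin_acquire(lock, tries):
    """Busy wait: retry immediately, up to `tries` times.

    Returns the attempt number that succeeded, or None if every
    attempt found the lock held. No sleeping happens in this loop,
    so it burns CPU for as long as it runs.
    """
    for attempt in range(1, tries + 1):
        if lock.acquire(blocking=False):
            return attempt
    return None


def acquire_with_spin(lock, tries, timeout):
    """Spin first; if that fails, sleep on the lock for up to `timeout` seconds.

    With tries=0 the spin phase is skipped and only the blocking wait
    is used. Returns True if the lock was obtained, False on timeout.
    """
    if tries > 0 and spin_acquire(lock, tries) is not None:
        return True
    # Blocking wait: the thread is descheduled instead of retrying.
    return lock.acquire(timeout=timeout)


if __name__ == "__main__":
    lock = threading.Lock()
    print(spin_acquire(lock, 250))    # lock is free, first attempt succeeds
    print(spin_acquire(lock, 250))    # lock is now held, all attempts fail
    print(acquire_with_spin(lock, 0, 0.05))  # blocking wait times out
    lock.release()
```

Spinning only pays off when the lock is typically released within a few retries; if the holder keeps it for longer, the retries are wasted CPU and the caller ends up sleeping anyway.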
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 26
- Joined: Mon Aug 13, 2007 8:06 pm
- Location: Australia
SPINTRIES
Hi Craig,
Given we are not on HP, should I leave SPINTRIES alone for now? Or should I go ahead anyway and set SPINTRIES to 250?
This "timeout waiting for mutex" problem seems to be occurring sporadically, and I have no means to prevent it other than removing the link collector stage and replacing it with a hashed file or some equivalent.
Rick