Problem using Link Partitioner and Hash lookup

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Post by ak77 »

Hello Everybody,

This job runs perfectly when I don't use the Link Partitioner and Link Collector.

The job is pretty simple.
I am selecting data from Oracle using the OCI plug-in, with a user-defined query that joins three tables, and looking up against a hash file.

This worked perfectly, so I thought of using the Link Partitioner and Link Collector to improve performance. But at some point no more data came from the OCI stage; I waited a long while and then stopped the job.

When I stopped the job, I got this error message on two links:
1. the output link from the Link Partitioner to the Transformer
2. the output link from the Transformer to the Link Collector

"ds_ipcgetnext - timeout waiting for mutex"

Thanks for the help

Regards,
Kishan
logic
Participant
Posts: 115
Joined: Thu Feb 24, 2005 10:48 am

Post by logic »

Hi Ak77,
The stages that you are using each run in a separate process, so the problem is probably with the memory buffer between them. You can adjust the buffer size. Exceeding the timeout will cause the mutex error.
Hope this helps.
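
To illustrate the point about buffers and timeouts, here is a minimal Python sketch. It is not DataStage code: it assumes a bounded in-memory buffer between a writer and a reader, with a timeout on each write. The names `BUFFER_ROWS`, `TIMEOUT_SECONDS`, and `write_rows` are hypothetical and only loosely stand in for the buffer size and timeout settings.

```python
import queue

BUFFER_ROWS = 4          # hypothetical stand-in for the buffer size
TIMEOUT_SECONDS = 0.1    # hypothetical stand-in for the timeout


def write_rows(buffer, rows, timeout):
    """Put rows into a bounded buffer; return how many fit before a timeout."""
    written = 0
    for row in rows:
        try:
            # Blocks while the buffer is full; gives up after `timeout` seconds.
            buffer.put(row, timeout=timeout)
        except queue.Full:
            # Nobody drained the buffer in time: analogous to a write timeout.
            break
        written += 1
    return written


# No reader is draining the buffer here, so the writer stalls once it is full.
buffer = queue.Queue(maxsize=BUFFER_ROWS)
rows_written = write_rows(buffer, range(10), TIMEOUT_SECONDS)
print(rows_written)
```

In this sketch a larger buffer only postpones the timeout if the reader never drains it; the writer still stalls once the buffer fills.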
ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Lemme change the buffer size and see

Post by ak77 »

Thanks for the immediate reply,

Buffer size: 128 KB
Timeout: 10 sec
Array size in OCI plug-in: 500

I will try to increase the buffer size and see if that helps

Thanks again

Regards,
Kishan
ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Re: Lemme change the buffer size and see

Post by ak77 »

Not working!

I increased the buffer size to 256 and it's still doing the same thing.
Is there a limit on the buffer size?
Should I also increase the timeout?

Thanks
Kishan
ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Post by ak77 »

Hi logic,

Increasing the buffer size does not change anything.
Could there be some other issue?

Regards,
Kishan
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Stopping the job is the cause of the problem. There's a process still waiting to read data from a named pipe - you've stopped the job, so no data come along the pipe, and the waiting process takes a timeout.

If you choose to stop a job in which any form of inter-process communication is occurring, then you will get a timeout of this kind. Whether it's a timeout on a mutex lock will depend on how the particular operating system implements its semaphores.
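
The behaviour described above can be sketched in Python. This is an illustration only, not how DataStage is implemented: an in-memory queue stands in for the named pipe, a thread stands in for the writing process, and `READ_TIMEOUT_SECONDS`, `writer`, `reader`, and `run` are hypothetical names.

```python
import queue
import threading

READ_TIMEOUT_SECONDS = 0.2  # hypothetical; not a DataStage setting


def writer(pipe, rows, stop_event):
    """Send rows until finished or until the job is 'stopped'."""
    for row in rows:
        if stop_event.is_set():
            return  # stopped early: no more rows and no end-of-data marker
        pipe.put(row)
    pipe.put(None)  # normal end-of-data marker


def reader(pipe, timeout):
    """Read rows until end-of-data; report a timeout if the writer goes silent."""
    received = []
    while True:
        try:
            row = pipe.get(timeout=timeout)
        except queue.Empty:
            return received, "timeout"
        if row is None:
            return received, "finished"
        received.append(row)


def run(stop_before_start):
    """Run one writer thread and one reader; optionally stop the writer first."""
    pipe = queue.Queue()
    stop_event = threading.Event()
    if stop_before_start:
        stop_event.set()
    thread = threading.Thread(target=writer, args=(pipe, [1, 2, 3], stop_event))
    thread.start()
    result = reader(pipe, READ_TIMEOUT_SECONDS)
    thread.join()
    return result


normal_result = run(stop_before_start=False)
stopped_result = run(stop_before_start=True)
print(normal_result)
print(stopped_result)
```

When the writer is stopped, the reader never sees an end-of-data marker, so the only way it can return is through its timeout.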
ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Is space available on the server a reason for the mutex error?

Post by ak77 »

Thanks Ray,

Is the space available on the server a reason for this error?
I am asking because another job aborted with a mutex error.
It ran perfectly when I used small subsets.
I am using a Sort and an Aggregator stage followed by a Transformer.

Will doing these in different stages help solve this problem?

Thanks again

Regards,
Kishan
ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Post by ak77 »

Hello again,

I just read this in one of the earlier posts:

"On a project I was working on, we were having mutex errors while using Link Collectors. They were running DataStage 7.5.1 on an AIX server. It seems there was a glitch in the AIX operating system that was causing the mutex error on any job using a Link Collector.

Anyone here that is having mutex errors running on an AIX server?

Tim"

Is this an issue with AIX?

Thank you
Kishan
ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Post by ak77 »

Hello Again,

I removed the Link Collector, and the job is running fine with the default settings.

Also, I removed the Sort stage in the other job, and it completed with no problem for the whole 10 million records.

Can someone explain why the transfer of data becomes a problem when using the Link Collector stage?

Is it something to do with the buffer size?


Thanks again

Regards,
Kishan
ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Post by ak77 »

Someone!

The job processed all the data but gave a mutex error after it finished.

It fetched the right number of records from the database, and the rows on each link of the partitioner added up to the same number of records in the output file.

These are the log statements at the end:
[1. Finished Job Select
2. Select..xfmJoinTables1.OutLnkPrt1: ds_ipcput - timeout waiting for mutex
and so on]
Thanks

Regards,
Kishan
ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Post by ak77 »

Hello Everybody,

Today I connected a sequential file to each output link of the Link Partitioner and used these sequential files as input to the Link Collector. It worked fine with both the partitioner and collector in the same job with default settings.

I am not sure whether I am doing something wrong that was causing the timeout.

If anybody has had a similar problem, can you please tell me how you solved it?

Thank you all


Regards,
Kishan
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Are you monitoring the server load to determine whether your timeouts were actually caused by significant load on the machine? When you divide your processing across more CPUs, you introduce more overhead on the machine and open up your process to certain issues.

If you can, please use top, prstat, glance, or another CPU load measuring tool to determine whether the machine is overwhelmed during processing.
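
If none of those tools is handy, a rough check can be scripted. The following Python sketch is only an assumption about how one might do it: it samples the 1-minute load average via `os.getloadavg()` (Unix only) while the job runs. The function names and the threshold of 1.0 load per CPU are hypothetical rules of thumb, not anything DataStage provides.

```python
import os
import time


def sample_load(samples, interval_seconds, read_load=None):
    """Collect 1-minute load averages at a fixed interval."""
    if read_load is None:
        read_load = os.getloadavg  # Unix only; returns (1 min, 5 min, 15 min)
    readings = []
    for i in range(samples):
        one_minute, _five, _fifteen = read_load()
        readings.append(one_minute)
        if i < samples - 1:
            time.sleep(interval_seconds)
    return readings


def is_overloaded(readings, cpu_count, threshold_per_cpu=1.0):
    """Flag overload when the peak load per CPU exceeds the (assumed) threshold."""
    return max(readings) / cpu_count > threshold_per_cpu


if __name__ == "__main__":
    # Take two samples half a second apart and report the verdict.
    readings = sample_load(samples=2, interval_seconds=0.5)
    print(readings, is_overloaded(readings, os.cpu_count() or 1))
```

Running this alongside the job, once with and once without the Link Partitioner and Link Collector, would give a crude comparison of machine load.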
Kenneth Bland
ak77
Charter Member
Posts: 70
Joined: Thu Jun 23, 2005 5:47 pm
Location: Oklahoma

Post by ak77 »

Thanks Kenneth,

I will look into it and get back to you.
But what I noticed is that the jobs with Link Partitioner / Collector run slower than the simple straightforward job.

Thanks again

Regards,
Kishan