Jobs failing when running in parallel

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

pradeep9081
Participant
Posts: 30
Joined: Tue May 11, 2010 2:05 pm

Jobs failing when running in parallel

Post by pradeep9081 »

Hi,

We are running multiple jobs (5-6) at a time from the scheduler.
The jobs are failing with the error below:

Unable to start ORCHESTRATE process on node node1 (nsyrp41b): APT_PMPlayer::APT_PMPlayer: fork() failed, Not enough space.

If I run the jobs individually, they work fine.

If a job contains a Lookup stage, we get this error instead:

"/eeadm2/IBM/InformationServer/Server/Datasets/lookuptable.20100803.jm02qdd": No space left on device
APT_BufferOperator: Add block to queue failed. This means that your buffer file systems all ran out of file space, or that some other system error occurred. Please ensure that you have sufficient scratchdisks in either the default or "buffer" pools on all nodes in your configuration file.

We have a 2-node configuration file in dev, with both nodes pointing to the same scratch disk space.
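
For context, the configuration file looks roughly like the sketch below; the fastname and the dataset path are the values from the error above, but the scratchdisk path is only illustrative. Both logical nodes point at the same scratchdisk file system:

{
    node "node1"
    {
        fastname "nsyrp41b"
        pools ""
        resource disk "/eeadm2/IBM/InformationServer/Server/Datasets" {pools ""}
        resource scratchdisk "/eeadm2/Scratch" {pools ""}
    }
    node "node2"
    {
        fastname "nsyrp41b"
        pools ""
        resource disk "/eeadm2/IBM/InformationServer/Server/Datasets" {pools ""}
        resource scratchdisk "/eeadm2/Scratch" {pools ""}
    }
}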

Is this due to the buffer size? What is the best resolution for this?
kris007
Charter Member
Posts: 1102
Joined: Tue Jan 24, 2006 5:38 pm
Location: Riverside, RI

Re: Jobs failing when running in parallel

Post by kris007 »

pradeep9081 wrote: "/eeadm2/IBM/InformationServer/Server/Datasets/lookuptable.20100803.jm02qdd": No space left on device
APT_BufferOperator: Add block to queue failed. This means that your buffer file systems all ran out of file space, or that some other system error occurred. Please ensure that you have sufficient scratchdisks in either the default or "buffer" pools on all nodes in your configuration file.

We have a 2-node configuration file in dev, with both nodes pointing to the same scratch disk space.

Is this due to the buffer size? What is the best resolution for this?
Yes. The error message says it all. Your scratch disk space is full. You need to add extra space or reschedule your jobs so that they don't run at the same time.
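
If rescheduling is the easier option, one simple approach is to have the scheduler call the jobs back to back instead of all at once, for example with a small wrapper around dsjob. This is only a rough sketch; the project and job names are placeholders:

#!/bin/sh
# Run the jobs one after another so they do not all compete for the
# same scratch disk at the same time.
# DEV_PROJECT and the job names are placeholders - substitute your own.
for job in load_job_1 load_job_2 load_job_3
do
    # -jobstatus makes dsjob wait until the job finishes and reflect
    # the job's status in its return code
    dsjob -run -jobstatus DEV_PROJECT "$job"
    rc=$?
    echo "$job finished with dsjob return code $rc"
done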
Kris

Where's the "Any" key? - Homer Simpson
mouthou
Participant
Posts: 208
Joined: Sun Jul 04, 2004 11:57 pm

Re: Jobs failing when running in parallel

Post by mouthou »

Or, if disk space management is out of your control, the job(s) can be modified to use a Join stage instead of a Lookup, or to apply filter conditions so that the Lookup handles less reference data :idea:
Barath
Participant
Posts: 17
Joined: Mon Sep 29, 2008 4:00 am
Location: Mumbai

Re: Jobs failing when running in parallel

Post by Barath »

pradeep9081 wrote: Hi,

We are running multiple jobs (5-6) at a time from the scheduler.
The jobs are failing with the error below:

Unable to start ORCHESTRATE process on node node1 (nsyrp41b): APT_PMPlayer::APT_PMPlayer: fork() failed, Not enough space.

If I run the jobs individually, they work fine.

If a job contains a Lookup stage, we get this error instead:

"/eeadm2/IBM/InformationServer/Server/Datasets/lookuptable.20100803.jm02qdd": No space left on device
APT_BufferOperator: Add block to queue failed. This means that your buffer file systems all ran out of file space, or that some other system error occurred. Please ensure that you have sufficient scratchdisks in either the default or "buffer" pools on all nodes in your configuration file.

We have a 2-node configuration file in dev, with both nodes pointing to the same scratch disk space.

Is this due to the buffer size? What is the best resolution for this?
Which partitioning method are you using on the lookup reference link? If it is Entire, change it to Hash. Entire sends a complete copy of the reference data to every node, so it uses far more space than Hash, which splits the data across the nodes.
If you still run out of space after that, then you need to add disk space; what kris007 says is correct.