ETL Server space problem

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

ankita
Participant
Posts: 57
Joined: Sun Nov 13, 2005 11:17 pm

ETL Server space problem

Post by ankita »

Hi All,
A few jobs are failing in production with the error below.

node_node1: Fatal Error: Unable to start ORCHESTRATE process on node node1 (<ETL server name>): APT_PMPlayer::APT_PMPlayer: fork() failed, Not enough space

My understanding is that this happens when the transactional volume of these jobs (running in parallel) does not fit into the node space. Please let me know if that's the actual scenario.
If so, is the node space shared between stream data and persistent Datasets?
Please advise what to do for now, and also for the future.

Thanks !
Ankita
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Please search the forum for this error message.

It has nothing at all to do with transactional volume. The fork() function is involved in getting processes started.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
harshada
Premium Member
Posts: 92
Joined: Tue May 29, 2007 8:40 am

Post by harshada »

Check the maxuproc value on UNIX with the help of the command

Code: Select all

lsattr -EHl sys0 | grep maxuproc

This gives the maximum number of processes allowed per user. Check the number of processes running for your DataStage jobs; if it exceeds the maxuproc value, you usually get the fork() failed error. Try killing some of the old processes, have the maxuproc value raised, or restart the UNIX box; one of these should solve the problem.
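
If it helps, a quick way to compare is something like the following (just a sketch; dsadm is an assumed DataStage user name, replace it with your own):

Code: Select all

# count the processes currently owned by the DataStage user (dsadm is an assumed name)
ps -u dsadm -o pid= | wc -l

Compare that count against the maxuproc value reported above.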
harshada
Premium Member
Posts: 92
Joined: Tue May 29, 2007 8:40 am

Post by harshada »

Maximum number of PROCESSES allowed per user
ankita
Participant
Posts: 57
Joined: Sun Nov 13, 2005 11:17 pm

Post by ankita »

Thanks for your suggestions !
I have tried the command below, but the shell doesn't recognize it. Ours is SunOS 5.9; maybe that's why it didn't work.

$ lsattr -EHl sys0 | grep maxuproc
ksh: lsattr: not found
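
Since lsattr seems to be AIX-specific, maybe the rough Solaris equivalent is something like this (assuming the stock kernel tunable maxuprc, which sysdef reports as v.v_maxup):

Code: Select all

# Solaris: show the per-user process limit derived from the maxuprc tunable
sysdef | grep -i "processes per user"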

I was also checking the ulimit option to get the resource limits, and below is the output in production:
$ ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) unlimited
nofiles(descriptors) 256
vmemory(kbytes) unlimited

Can you please tell me some details about 'nofiles(descriptors)'? What should the standard limit for it be? I see it's lower than on dev.

Thanks,
Ankita
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

ankita wrote: I was also checking the ulimit option to get the resource limits, and below is the output in production:
$ ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) unlimited
nofiles(descriptors) 256
vmemory(kbytes) unlimited
If this information was gathered by executing the command by hand from the command line, it really isn't correct. You need to capture it from a running job to know what the limits are in that environment, even if you used 'the same user' at the command line. Add an 'ExecSH' before-job call to run that command in any job and let us know what it reports.
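
For example (just a sketch; the output path is only an illustration), set the before-job subroutine to ExecSH with an input value along these lines:

Code: Select all

# ExecSH input value: capture the limits of the actual job environment
ulimit -a > /tmp/ds_job_ulimit.out 2>&1

Then look at that file after the job runs to see the real limits in effect.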
-craig

"You can never have too many knives" -- Logan Nine Fingers