Hi,
The performance of our jobs keeps getting very bad. We rebooted the DataStage server and brought the DataStage engine back up.
However, the performance problem persists, and the jobs abort very frequently.
They give the following messages:
node_node3: Unable to start ORCHESTRATE process on node node3 (bidev1): APT_PMPlayer::APT_PMPlayer: fork() failed, Not enough space
main_program: The Section Leader on node node3 has terminated unexpectedly.
node_node1: Unable to start ORCHESTRATE process on node node1 (bidev1): APT_PMPlayer::APT_PMPlayer: fork() failed, Not enough space
node_node2: Unable to start ORCHESTRATE process on node node2 (bidev1): APT_PMPlayer::APT_PMPlayer: fork() failed, Not enough space
node_node4: Unable to start ORCHESTRATE process on node node4 (bidev1): APT_PMPlayer::APT_PMPlayer: fork() failed, Not enough space
What is the resolution to this?
Any pointers to this are most welcome.
Thanks.
Regards,
Nitin
jobs on 4 x 4 node PX: Very slow & finally failing
-
- Participant
- Posts: 21
- Joined: Wed Oct 01, 2003 11:53 am
First off, I don't have PX. But a 'fork' message of this type means a new process couldn't be forked (created) because of a lack of 'space', and space here typically means swap space or RAM. It could be something else, like topping out a configuration parameter, but let's start there.
System specs? Any issues with RAM or swap you are aware of?
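For anyone wanting a quick way to answer the RAM/swap question, here is a minimal sketch. It assumes a Linux host that exposes `/proc/meminfo`; the original post does not say which operating system bidev1 runs, so on other Unix flavours the same numbers would have to come from the vendor's own tools. The function names are made up for illustration.

```python
def parse_meminfo(text):
    """Return {field: value} parsed from /proc/meminfo-style text.

    Values are in kB for the memory fields (assumption: Linux layout).
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def headroom_report(info):
    """Summarise free RAM and free swap in megabytes."""
    # MemAvailable is missing on older kernels, so fall back to MemFree.
    ram_kb = info.get("MemAvailable", info.get("MemFree", 0))
    swap_kb = info.get("SwapFree", 0)
    return {"ram_free_mb": ram_kb // 1024, "swap_free_mb": swap_kb // 1024}
```

On a Linux box you would feed it the live file, e.g. `headroom_report(parse_meminfo(open("/proc/meminfo").read()))`, ideally sampled while the failing job is starting up. If free swap is close to zero at that moment, that fits the "fork() failed, Not enough space" message.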
-craig
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
When a PX job starts, the first process is the Conductor, which starts a Section Leader process for each node in the configuration file. (It WAS called "Orchestrate" after all!) Each Section Leader may then start one or more Player (osh) processes to do the actual work, the number also depending on your partitioning scheme(s).
What the error message is telling you is that some space resource (memory or disk) is inadequate for the number of processes you are trying to start. Try running the job using a smaller configuration (perhaps only two nodes) and monitor memory and disk space consumption. The scale-up factor to four nodes is approximately linear.
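To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes one Conductor, one per-node process (the Section Leader named in the log) and a fixed number of Player processes on every node. `players_per_node` is a hypothetical input you would have to read off your own job, and the linear extrapolation is only the approximation described above, not a measured result.

```python
def estimate_process_count(nodes, players_per_node):
    """Conductor + one per-node leader process + Players on every node."""
    return 1 + nodes * (1 + players_per_node)


def scale_linearly(measured, measured_nodes, target_nodes):
    """Extrapolate a value measured at measured_nodes to target_nodes.

    Assumes consumption grows in proportion to the node count.
    """
    return measured * target_nodes / measured_nodes
```

For example, if the job runs with 5 Players per node, `estimate_process_count(2, 5)` is 13 and `estimate_process_count(4, 5)` is 25. If the two-node run peaks at 3 GB, `scale_linearly(3.0, 2, 4)` suggests budgeting about 6 GB for four nodes.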
Ray Wurlod
Education and Consulting Services
ABN 57 092 448 518
quote:Originally posted by Ray.Wurlod
Try running the job using a smaller configuration (perhaps only two nodes) and monitor memory and disk space consumption. The scale-up factor to four nodes is approximately linear.
To add to this, also make sure your kernel is configured with at least the minimum settings Ascential recommends in the "Install and Upgrade Guide". It makes a dramatic difference in terms of resource availability.
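A related check, sketched here with Python's standard `resource` module: print the per-user limits in effect for the account that runs the jobs. These are shell-level limits, not the kernel parameters themselves, and the recommended values are in the guide and are not reproduced here. Which limit names exist varies by platform, so the sketch skips any that are missing.

```python
import resource

# Limits that commonly bound how many processes a user may fork and how
# much memory each process may use. Not every platform defines all of them.
LIMIT_NAMES = ("RLIMIT_NPROC", "RLIMIT_AS", "RLIMIT_DATA", "RLIMIT_NOFILE")


def _show(value):
    """Render the 'no limit' sentinel as a readable string."""
    return "unlimited" if value == resource.RLIM_INFINITY else value


def current_limits():
    """Return {name: (soft, hard)} for the limits this platform supports."""
    limits = {}
    for name in LIMIT_NAMES:
        constant = getattr(resource, name, None)
        if constant is None:
            continue  # limit not defined on this platform
        soft, hard = resource.getrlimit(constant)
        limits[name] = (_show(soft), _show(hard))
    return limits
```

Run `current_limits()` as the same user that starts the DataStage jobs and compare the output against whatever the guide specifies for your platform.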
-T.J.
* * *
... now if this can make breakfast, my life is complete.