Unable to start ORCHESTRATE process help!!!

Posted: Tue Jul 26, 2005 9:38 am
by wojtask
Hi All!

Today I got the following error message:
Unable to start ORCHESTRATE process on node node0 (regan): APT_PMPlayer::APT_PMPlayer: fork() failed, Not enough space [processmgr/player.C:394]

df -k shows that 52% of the space is available (for both datasets and scratch).

ulimit -a shows:

core file size (blocks) unlimited
data seg size (kbytes) unlimited
file size (blocks) unlimited
open files 1024
pipe size (512 bytes) 10
stack size (kbytes) 8192
cpu time (seconds) unlimited
max user processes 29995
virtual memory (kbytes) unlimited

The APT configuration file contains:

node "node0"
{
fastname "regan"
pools ""
resource disk "/usr/dsadm/Ascential/DataStage/Datasets" {pools ""}
resource scratchdisk "/usr/dsadm/Ascential/DataStage/Scratch" {pools ""}
}


The OS is Solaris.
The job processes about 1000 records.

Do you have any idea what may cause this error?

Regards
Wojtask

Posted: Tue Jul 26, 2005 9:22 pm
by ray.wurlod
Are there a lot of other processes on the system when this error occurs? The "Not enough space" message may relate to the process table. Also check whether the inode table is full for the file system in question.

Posted: Thu Jul 28, 2005 3:06 am
by wojtask
Hi!

Unfortunately, the inode table changes didn't help. Here's part of the log from a job run after the DS server reconfiguration and restart.


Regards
Wojtask

Occurred: 11:04:10 On date: 28-07-2005 Type: Fatal
Event: cs}}}(0),0: Failure during execution of operator logic [api/operator_rep.C:315]

Occurred: 11:04:10 On date: 28-07-2005 Type: Info
Event: cs}}}(0),0: Output 0 produced 0 records

Occurred: 11:04:10 On date: 28-07-2005 Type: Fatal
Event: cs}}}(0),0: Fatal Error: Fork failed: Not enough space [sort/writer.C:173]

Occurred: 11:04:10 On date: 28-07-2005 Type: Fatal
Event: cs}}},0: Failure during execution of operator logic [api/operator_rep.C:315]

Occurred: 11:04:10 On date: 28-07-2005 Type: Info
Event: cs}}},0: Output 0 produced 0 records

Occurred: 11:04:10 On date: 28-07-2005 Type: Fatal
Event: cs}}},0: Fatal Error: mmap failed: Resource temporarily unavailable [sort/writer.C:165]

Occurred: 11:04:10 On date: 28-07-2005 Type: Fatal
Event: ContainerC55.Change_Capture_390,0: Failure during execution of operator logic [api/operator_rep.C:315]

Occurred: 11:04:10 On date: 28-07-2005 Type: Info
Event: ContainerC55.Change_Capture_390,0: Output 0 produced 0 records

Occurred: 11:04:10 On date: 28-07-2005 Type: Fatal
Event: ContainerC55.Change_Capture_390,0: Fatal Error: waitForWriteSignal(): Premature EOF on node regan No such file or directory [iomgr/iocomm.C:1518]

Occurred: 11:04:10 On date: 28-07-2005 Type: Fatal
Event: cs}}},0: Failure during execution of operator logic [api/operator_rep.C:315]

Occurred: 11:04:10 On date: 28-07-2005 Type: Info
Event: cs}}},0: Output 0 produced 0 records

Occurred: 11:04:10 On date: 28-07-2005 Type: Fatal
Event: cs}}},0: Fatal Error: Fork failed: Not enough space [sort/writer.C:173]


Occurred: 11:04:10 On date: 28-07-2005 Type: Fatal
Event: subArgs={asc}}},0: Failure during execution of operator logic [api/operator_rep.C:315]

Occurred: 11:04:10 On date: 28-07-2005 Type: Info
Event: subArgs={asc}}},0: Output 0 produced 0 records

Occurred: 11:04:10 On date: 28-07-2005 Type: Fatal
Event: subArgs={asc}}},0: Fatal Error: Fork failed: Not enough space [sort/writer.C:173]

Occurred: 11:04:10 On date: 28-07-2005 Type: Fatal
Event: subArgs={asc}}},0: Failure during execution of operator logic [api/operator_rep.C:315]

Occurred: 11:04:10 On date: 28-07-2005 Type: Info
Event: subArgs={asc}}},0: Output 0 produced 0 records
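To see which fatal errors dominate a log like the one above, the messages can be tallied. This is only a minimal sketch: the function name `tally_fatal_errors` and the regular expression are my own assumptions about the line layout shown here, not part of DataStage, and the sample is a three-line excerpt copied from the log.

```python
import re
from collections import Counter

# Assumed layout: "... Fatal Error: <message> [source/file.C:line]"
FATAL_RE = re.compile(r"Fatal Error: (.*?)\s*\[[^\]]*\]\s*$")


def tally_fatal_errors(log_text):
    """Count each distinct 'Fatal Error' message in a block of log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FATAL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three lines copied from the job log above.
sample = """\
Event: cs}}}(0),0: Fatal Error: Fork failed: Not enough space [sort/writer.C:173]
Event: cs}}},0: Fatal Error: mmap failed: Resource temporarily unavailable [sort/writer.C:165]
Event: cs}}},0: Fatal Error: Fork failed: Not enough space [sort/writer.C:173]
"""

for message, count in tally_fatal_errors(sample).most_common():
    print(count, message)
```

Grouping the messages this way separates the fork and mmap failures from the follow-on "Failure during execution of operator logic" entries, which carry no "Fatal Error:" text of their own.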

Posted: Thu Jul 28, 2005 3:16 am
by ArndW
The "not enough space" is happening on a "fork" so I would look at physical and virtual memory space, plus process memory.
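As background for that suggestion: on Solaris, "Not enough space" is the system's wording for ENOMEM, and "Resource temporarily unavailable" is the wording for EAGAIN, so neither message refers to disk space. A small sketch that labels log lines accordingly (the `classify` helper is hypothetical, and the mapping covers only the two messages seen in this thread):

```python
# Solaris wording for the two errno values seen in the job log.
# Other systems word ENOMEM differently (Linux prints "Cannot allocate memory").
SOLARIS_MESSAGES = {
    "Not enough space": "ENOMEM",
    "Resource temporarily unavailable": "EAGAIN",
}


def classify(log_line):
    """Return the symbolic errno name matching a log line, or None."""
    for text, name in SOLARIS_MESSAGES.items():
        if text in log_line:
            return name
    return None


print(classify("Fatal Error: Fork failed: Not enough space [sort/writer.C:173]"))
print(classify("Fatal Error: mmap failed: Resource temporarily unavailable [sort/writer.C:165]"))
```

Both values point at memory, swap, or process resources rather than the file systems reported by df -k, which fits the suggestion to look at physical and virtual memory.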

Stage Execution Mode?

Posted: Fri Jul 29, 2005 7:57 am
by murphykevin
I recently had a similar fork failure. Take a look at the resources that DS is creating, as shown in your Director output log. My issue was in the JOIN stage, trying to join 6 SybaseOC tables. The job was using 5.1 GB during execution. The log showed that the process was creating 35 datasets to perform the join. I changed the JOIN execution mode from parallel (the default) to sequential, which reduced the number of datasets to 16. The fork failure error disappeared.

Posted: Fri Jul 29, 2005 8:25 am
by wojtask
Setting all stages to run in sequential mode (where possible) didn't help :(

Posted: Mon Aug 01, 2005 9:30 pm
by trokosz
I do not know which operating system you're speaking of, but each operating system has a kernel setting called "max processes", and for PX there need to be 100 processes for each node defined in your APT configuration file. So 4 nodes implies 400 max processes. This could be your issue; I have seen this error before. See your Unix admin.
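The rule of thumb in the post above is simple arithmetic, sketched here. The figure of 100 processes per node comes from that post only, and the function names are hypothetical.

```python
PROCESSES_PER_NODE = 100  # rule of thumb quoted in the post above, not an official figure


def required_max_processes(node_count, per_node=PROCESSES_PER_NODE):
    """Process budget implied by the rule of thumb for a given node count."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return node_count * per_node


def limit_is_sufficient(node_count, max_processes):
    """True if the configured process limit covers the implied budget."""
    return max_processes >= required_max_processes(node_count)


print(required_max_processes(4))       # the 4-node example from the post
print(limit_is_sufficient(1, 29995))   # one node, per-user limit from the first post
```

The configuration file in the first post defines a single node, and ulimit -a there reports 29995 max user processes, so the per-user limit looks ample by this rule. The kernel-wide setting is a separate value that the Unix admin would need to check.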