Job aborts due to error in writeBlock - could not write


pdntsap
Premium Member
Posts: 107
Joined: Mon Jul 04, 2011 5:38 pm

Post by pdntsap »

Hello,

The parallel job has a sequential file input and a Lookup stage that looks up against two files and writes its output to a sequential file. The following error occurs intermittently. The job might abort on the first run, but if it is recompiled and run again it runs fine. It can then be run about 10 times without any problems and may fail again with the same error on, say, the 12th run.

node_node1: Player 5 terminated unexpectedly.
main_program: Unexpected termination by Unix signal 9(SIGKILL)
buffer(0),0: Error in writeBlock - could not write 131000
buffer(0),0: Failure during execution of operator logic.
buffer(0),0: Input 0 consumed 0 records.
buffer(0),0: Output 0 produced 0 records.
buffer(0),0: Fatal Error: APT_BufferOperator::writeAllData() write failed. This is probably due to a downstream operator failure.
node_node1: Player 7 terminated unexpectedly.
main_program: Unexpected exit status 1
Transformer_11,0: Failure during execution of operator logic.
Transformer_11,0: Input 0 consumed 0 records.
Transformer_11,0: Output 0 produced 0 records.
main_program: Unexpected exit status 1
Unexpected exit status 1
Unexpected exit status 1
Unexpected exit status 1
Unexpected exit status 1

$APT_DISABLE_COMBINATION is set to True. I suspect the error is due to a memory or buffer issue. Any help is appreciated.

Thanks.
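
One line in the log above is worth isolating: "Unexpected termination by Unix signal 9(SIGKILL)". SIGKILL cannot be caught, blocked, or ignored by the process that receives it, so the player had no chance to report its own error. It is typically sent by the operating system or by another process, which is consistent with (but does not prove) a resource problem such as memory exhaustion. A minimal Python sketch for pulling such lines out of a saved job log follows. The function name find_signal_terminations is made up for illustration, and the sketch assumes the log was exported as plain text in the format shown above.

```python
import re

# Matches lines such as:
#   main_program: Unexpected termination by Unix signal 9(SIGKILL)
SIGNAL_RE = re.compile(
    r"^(?P<source>[^:\n]+): Unexpected termination by Unix signal "
    r"(?P<number>\d+)\((?P<name>\w+)\)",
    re.MULTILINE,
)


def find_signal_terminations(log_text):
    """Return (source, signal number, signal name) for each signal line found."""
    return [
        (m.group("source").strip(), int(m.group("number")), m.group("name"))
        for m in SIGNAL_RE.finditer(log_text)
    ]


# Sample taken from the log excerpt in the post above.
sample = """node_node1: Player 5 terminated unexpectedly.
main_program: Unexpected termination by Unix signal 9(SIGKILL)
buffer(0),0: Error in writeBlock - could not write 131000
"""
print(find_signal_terminations(sample))  # [('main_program', 9, 'SIGKILL')]
```

If a SIGKILL line shows up only on the runs that abort, checking the operating system logs for the same timestamp is a reasonable next step.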
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

I think you have either two jobs or two stages writing to the same sequential file.
Mamu Kim
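
One way to test the theory that two writers share a target file is to list every job and stage that writes a sequential file and look for paths that appear more than once. The Python sketch below does this for a hand-built inventory. The job names, stage names, and paths are hypothetical, and the inventory format is an assumption; DataStage does not export a list in this form.

```python
import posixpath
from collections import defaultdict


def find_shared_targets(writers):
    """Group (job, stage, path) entries by normalized path.

    Returns only the paths that have more than one writer.
    """
    by_path = defaultdict(list)
    for job, stage, path in writers:
        # Normalize so that "/data//out.txt" and "/data/out.txt" compare equal.
        by_path[posixpath.normpath(path)].append((job, stage))
    return {path: who for path, who in by_path.items() if len(who) > 1}


# Hypothetical inventory, made up for illustration only.
writers = [
    ("JobA", "Sequential_File_20", "/data/out/result.txt"),
    ("JobA", "Sequential_File_21", "/data/out//result.txt"),
    ("JobB", "Sequential_File_3", "/data/out/other.txt"),
]

for path, who in find_shared_targets(writers).items():
    print(path, "is written by", who)
```

This only compares path strings, so it will not notice symbolic links or job parameters that resolve to the same file at run time.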
pdntsap
Premium Member
Posts: 107
Joined: Mon Jul 04, 2011 5:38 pm

Post by pdntsap »

An update:

With $APT_DISABLE_COMBINATION set to False, I get the following informational message when I run the job.

APT_CombinedOperatorController(1),0: Numeric string expected for stage variable 'StageVar0_StageVar2'. Use default value.

So I changed $APT_DISABLE_COMBINATION to True, and when I ran the job I got the following error.

node_node4: Player 3 terminated unexpectedly.
buffer(0),3: Error in writeBlock - could not write 131004
buffer(0),3: Failure during execution of operator logic.
buffer(0),3: Input 0 consumed 0 records.
buffer(0),3: Output 0 produced 0 records.
buffer(0),3: Fatal Error: APT_BufferOperator::writeAllData() write failed. This is probably due to a downstream operator failure.


So, when $APT_DISABLE_COMBINATION is set to False the job completes successfully, but when it is set to True the job aborts. What could be causing this behavior?

Thanks.






Please see below for the full log details.


Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: node_node4: Player 3 terminated unexpectedly.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: buffer(0),3: Error in writeBlock - could not write 131004

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: buffer(0),3: Failure during execution of operator logic.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Info
Event: buffer(0),3: Input 0 consumed 0 records.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Info
Event: buffer(0),3: Output 0 produced 0 records.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: buffer(0),3: Fatal Error: APT_BufferOperator::writeAllData() write failed. This is probably due to a downstream operator failure.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: node_node4: Player 5 terminated unexpectedly.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: main_program: Unexpected exit status 1 (...)

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: Transformer_11,3: Failure during execution of operator logic.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Info
Event: Transformer_11,3: Input 0 consumed 0 records.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Info
Event: Transformer_11,3: Output 0 produced 0 records.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: Transformer_11,3: Fatal Error: waitForWriteSignal(): Premature EOF on node f1xrv04a No such file or directory

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: main_program: Unexpected exit status 1 (...)

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: Lookup_5,1: Failure during execution of operator logic.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Info
Event: Lookup_5,1: Input 0 consumed 0 records. (...)

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Info
Event: Lookup_5,1: Output 0 produced 0 records.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: Lookup_5,1: Fatal Error: waitForWriteSignal(): Premature EOF on node f1xrv04a No such file or directory

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: Transformer_11,1: Failure during execution of operator logic.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Info
Event: Transformer_11,1: Input 0 consumed 0 records.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Info
Event: Transformer_11,1: Output 0 produced 0 records.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: Transformer_11,1: Fatal Error: waitForWriteSignal(): Premature EOF on node f1xrv04a No such file or directory

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: buffer(0),1: Error in writeBlock - could not write 131004

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: main_program: Unexpected exit status 1 (...)

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: Lookup_5,0: Failure during execution of operator logic.

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Info
Event: Lookup_5,0: Input 0 consumed 18 records. (...)

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Info
Event: Lookup_5,0: Output 0 produced 2 records. (...)

Occurred: 8:34:30 PM On date: 8/25/2011 Type: Fatal
Event: Lookup_5,0: Fatal Error: Unable to allocate communication resources

Occurred: 8:34:30 PM On date: 8/25/2011 Type: Fatal
Event: main_program: Step execution finished with status = FAILED.

Occurred: 8:34:30 PM On date: 8/25/2011 Type: Info
Event: main_program: Startup time, 0:09; production run time, 0:05.
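
With this many entries sharing the same timestamp, it can help to reduce the log to its distinct fatal messages and see which ones are not simply "downstream" or "Premature EOF" echoes of another failure. The Python sketch below parses the Occurred/Event pairs in the format shown above and counts the fatal messages. The function names parse_events and fatal_summary are made up for illustration, and the sketch assumes the log was saved as plain text with each Event line following its Occurred line.

```python
import re
from collections import Counter
from datetime import datetime

HEADER_RE = re.compile(
    r"Occurred: (\d{1,2}:\d{2}:\d{2} [AP]M) "
    r"On date: (\d{1,2}/\d{1,2}/\d{4}) Type: (\w+)"
)


def parse_events(log_text):
    """Pair each 'Occurred:' header with the 'Event:' line that follows it."""
    events = []
    header = None
    for raw in log_text.splitlines():
        line = raw.strip()
        match = HEADER_RE.match(line)
        if match:
            time_s, date_s, kind = match.groups()
            when = datetime.strptime(date_s + " " + time_s,
                                     "%m/%d/%Y %I:%M:%S %p")
            header = (when, kind)
        elif line.startswith("Event: ") and header is not None:
            # "buffer(0),3: Error in ..." -> source "buffer(0),3", rest is message
            source, _, message = line[len("Event: "):].partition(": ")
            events.append({"when": header[0], "type": header[1],
                           "source": source, "message": message})
            header = None
    return events


def fatal_summary(events):
    """Count how often each distinct fatal message appears."""
    return Counter(e["message"] for e in events if e["type"] == "Fatal")


# Three entries copied from the log above.
sample = """Occurred: 8:34:25 PM On date: 8/25/2011 Type: Fatal
Event: buffer(0),3: Error in writeBlock - could not write 131004

Occurred: 8:34:25 PM On date: 8/25/2011 Type: Info
Event: buffer(0),3: Input 0 consumed 0 records.

Occurred: 8:34:30 PM On date: 8/25/2011 Type: Fatal
Event: Lookup_5,0: Fatal Error: Unable to allocate communication resources
"""
for message, count in fatal_summary(parse_events(sample)).items():
    print(count, message)
```

The summary only groups messages; deciding which one is the original failure still needs judgment, since entries logged in the same second are not necessarily in causal order.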