
Fatal : node_node1: Player 2 terminated unexpectedly.

Posted: Thu Jun 01, 2006 7:48 am
by mctny
Hello everyone,

I was wondering if anyone knows about this error, which I got while running a scheduled job at night. The error is not descriptive; it says
"node_node1: Player 2 terminated unexpectedly." The next error is
main_program: Unexpected termination by Unix signal 9(SIGKILL)

any comments?

Posted: Thu Jun 01, 2006 8:13 am
by ashwin141
Hi Cetin

I have faced something similar, though I am not sure about the exact reason for this error. It may have something to do with disk space (resource and scratch).

Regards
Ashwin
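
One way to rule disk space in or out is to record the free space on the resource and scratch disks shortly before the nightly run. The following is only a minimal sketch: the two paths are hypothetical placeholders (both set to /tmp so the script runs anywhere) and should be replaced with the resource and scratch directories from your own parallel configuration file. The function name `free_space_report` is also made up for illustration.

```python
import shutil

# Hypothetical paths: substitute the resource and scratch disk
# directories listed in your own parallel configuration file.
DISKS = {
    "resource": "/tmp",
    "scratch": "/tmp",
}


def free_space_report(disks):
    """Return {label: (free_bytes, percent_free)} for each directory."""
    report = {}
    for label, path in disks.items():
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        report[label] = (usage.free, 100.0 * usage.free / usage.total)
    return report


if __name__ == "__main__":
    for label, (free, pct) in free_space_report(DISKS).items():
        print(f"{label}: {free / 2**30:.1f} GiB free ({pct:.1f}%)")
```

Running this from the scheduler just before the job, and keeping the output, gives something to compare against on the nights the job aborts.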

Posted: Thu Jun 01, 2006 9:00 am
by kris007
You got that error because someone tried to kill the Unix processes with kill -9. Never kill a process using kill -9; it is not good practice.
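
For context, signal 9 (SIGKILL) cannot be caught or handled by the process that receives it, so the parent only learns afterwards that the child was killed. The signal does not have to come from a person typing kill -9; any process with sufficient permission can send it, including the operating system itself, for example under memory pressure. The sketch below is not DataStage code. It is a generic Python illustration, for a Unix system, of how a parent process observes a child that was terminated by signal 9; the function name `run_and_kill` is made up for the example.

```python
import signal
import subprocess
import sys


def run_and_kill():
    """Start a child process, send it SIGKILL, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.send_signal(signal.SIGKILL)  # same effect as `kill -9 <pid>`
    # On Unix, a negative status -N means "terminated by signal N".
    return child.wait()


if __name__ == "__main__":
    status = run_and_kill()
    if status < 0:
        name = signal.Signals(-status).name
        print(f"child terminated by Unix signal {-status} ({name})")
```

The parent cannot tell from the exit status alone who sent the signal, which is why the log message in the original post says so little.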

Posted: Thu Jun 01, 2006 9:27 am
by DSguru2B
Was it a DB2 database that you were loading to?

Posted: Thu Jun 01, 2006 9:41 am
by kris007
Is there anything we need to look out for if it is a DB2 database? Just curious.

Posted: Thu Jun 01, 2006 10:44 am
by kumar_s
Perhaps DSguru2B got the same error while loading a table.

mctny - What was the load on your server when you got this error?
What is the volume of data you are working with?
I assume this is random, am I right?
Did anyone issue a kill -9 command from Unix?
What are the values of APT_MONITOR_SIZE and APT_MONITOR_TIME in your Administrator environment settings?
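
For anyone unsure how to check these values outside the Administrator client, here is a small sketch that reports whether the two monitor variables are set. It assumes the script is run in the same environment as the job, and how you arrange that is site-specific. The helper name `report_vars` and the "<not set>" marker are made up for illustration.

```python
import os

MONITOR_VARS = ("APT_MONITOR_SIZE", "APT_MONITOR_TIME")


def report_vars(names, env=None):
    """Return {name: value} for each variable, or '<not set>' if absent."""
    env = os.environ if env is None else env
    return {name: env.get(name, "<not set>") for name in names}


if __name__ == "__main__":
    for name, value in report_vars(MONITOR_VARS).items():
        print(f"{name} = {value}")
```

A value of "<not set>" here only means the variable is absent from this process's environment; the project-level settings in Administrator remain the authoritative place to look.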

Posted: Thu Jun 01, 2006 10:52 am
by nivaskvs
Because it is the PX version running on AIX, you are getting this error. There is a patch that fixes this issue permanently; you can contact the Ascential Support team to get it. A quick fix would be to restart the job.

Posted: Thu Jun 01, 2006 1:54 pm
by mctny
nivaskvs wrote: Because it is the PX version running on AIX, you are getting this error. There is a patch that fixes this issue permanently; you can contact the Ascential Support team to get it. A quick fix would be to restart the job.
Thank you guys for all the comments.
I don't think anyone could have issued a kill -9 command; it is a nightly job, and no one would have access to the Unix boxes at night.
Yes, that job normally runs successfully every night.
I tend to agree that it could be scratch disk space, but I am very new and don't know how to fix that problem. I don't know how to check those APT parameter values either.
Answers to the other questions:
It is an Oracle database we are loading data into, not DB2. We are using DataStage Enterprise Edition 7.5.1.A.

thanks again

Posted: Mon Aug 21, 2006 11:44 pm
by yakiku
Hi nivaskvs:

Do you have a reference name for this patch? We are seeing the same problem with our PX jobs. A job fails without any apparent reason, aborting with this error:

main_program: Unexpected termination by Unix signal 9(SIGKILL).

Upon rerunning the job, it executes fine.

Posted: Tue Aug 22, 2006 2:48 am
by kumar_s
yakiku wrote:Hi nivaskvs:

Do you have a reference name for this patch? We are seeing the same problem with our PX jobs. A job fails without any apparent reason, aborting with this error:

main_program: Unexpected termination by Unix signal 9(SIGKILL).

Upon rerunning the job, it executes fine.
What was the load on the server when you got this error? Have you tried the option suggested with the APT_MONITOR settings?

Posted: Tue Aug 22, 2006 5:00 am
by mali_aydin
Hi Cetin,
Empty files or null values are one of the causes of this problem. You have to add extra controls that handle null data or empty files.



MAli

Posted: Tue Aug 22, 2006 8:33 am
by yakiku
kumar: The system was at its lowest level of load at the time of the error.

MAli: The same job was rerun a minute after the failure without changing code, data, or parameters, and it ran fine. So that does not seem logical.

Posted: Tue Aug 22, 2006 8:52 am
by samba
I have also faced the same problem before.
I got exactly the same problem a couple of months back.
We increased the buffer size (I don't know how it was increased),
and after that we never faced the problem again.

Thanks

Posted: Tue Aug 22, 2006 7:04 pm
by kumar_s
Check the buffer settings in Administrator at the project level.
yakiku - How many jobs were being called in parallel? How many stages are in each job?

Posted: Thu Aug 24, 2006 4:09 pm
by yakiku
There was only one job running at the time of this error, and there were 12 stages in total in the job (Sequential files, Filter, Join, Funnel, Lookup, Transformers, and TD API).

These are the buffering variables at project level:

APT_BUFFERING_POLICY Automatic buffering
APT_BUFFER_DISK_WRITE_INCREMENT 1048576
APT_BUFFER_FREE_RUN 0.5
APT_BUFFER_MAXIMUM_MEMORY 3145728
APT_BUFFER_MAXIMUM_TIMEOUT 1
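
To make the numbers above easier to read, here is a small sketch that parses the listing and converts the two large values to MiB. It assumes, as the magnitudes suggest, that APT_BUFFER_MAXIMUM_MEMORY and APT_BUFFER_DISK_WRITE_INCREMENT are byte counts. The helper names `parse_settings` and `to_mib` are made up for illustration.

```python
# The project-level buffering variables exactly as posted above.
SETTINGS_TEXT = """\
APT_BUFFERING_POLICY Automatic buffering
APT_BUFFER_DISK_WRITE_INCREMENT 1048576
APT_BUFFER_FREE_RUN 0.5
APT_BUFFER_MAXIMUM_MEMORY 3145728
APT_BUFFER_MAXIMUM_TIMEOUT 1
"""


def parse_settings(text):
    """Split each 'NAME value' line on the first space into a dict."""
    settings = {}
    for line in text.splitlines():
        if line.strip():
            name, _, value = line.partition(" ")
            settings[name] = value.strip()
    return settings


def to_mib(byte_count):
    """Convert a byte count (string or number) to MiB. Assumes bytes."""
    return int(byte_count) / (1024 * 1024)


if __name__ == "__main__":
    s = parse_settings(SETTINGS_TEXT)
    for name in ("APT_BUFFER_MAXIMUM_MEMORY", "APT_BUFFER_DISK_WRITE_INCREMENT"):
        print(f"{name}: {to_mib(s[name]):.0f} MiB")
```

Under that assumption the maximum buffer memory is 3 MiB and the disk write increment is 1 MiB, which is the baseline to compare against if the buffer size is increased as suggested earlier in the thread.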