
SIGKILL Problem

Posted: Tue Dec 12, 2006 4:50 am
by ashik_punar
Hi Everyone,

I am facing the ghost of SIGKILL in my jobs. I am calling this a ghost because, after going through the posts here, I found that no particular cause or solution has been identified for this problem. The log for the job goes like this:

Occurred: 5:57:03 PM On date: 12/11/2006 Type: Control
Event: Starting Job Voy_TxtStock_job. (...)

Occurred: 5:57:04 PM On date: 12/11/2006 Type: Info
Event: Environment variable settings: (...)

Occurred: 5:57:04 PM On date: 12/11/2006 Type: Info
Event: Parallel job initiated (...)

Occurred: 5:57:04 PM On date: 12/11/2006 Type: Info
Event: main_program: Ascential DataStage(tm) Enterprise Edition 7.5.1A (...)

Occurred: 5:57:04 PM On date: 12/11/2006 Type: Info
Event: main_program: The open files limit is 2000; raising to 2147483647.

Occurred: 5:57:04 PM On date: 12/11/2006 Type: Info
Event: main_program: orchgeneral: loaded (...)

Occurred: 5:57:05 PM On date: 12/11/2006 Type: Info
Event: main_program: APT configuration file: /opt/Ascential/DataStage/Configurations/default.apt (...)

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(6),0: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(6),0: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(6),0: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(6),0: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(7),0: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(7),0: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(7),0: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(7),0: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: node_node1: Player 16 terminated unexpectedly.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: main_program: Unexpected exit status 1

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: node_node1: Player 17 terminated unexpectedly.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: main_program: Unexpected exit status 1

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(8),0: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(8),0: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(8),0: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(8),0: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: node_node1: Player 18 terminated unexpectedly.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: main_program: Unexpected exit status 1

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: CollectTxt_fu,0: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: CollectTxt_fu,0: Input 0 consumed 0 records. (...)

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: CollectTxt_fu,0: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: CollectTxt_fu,0: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: node_node2: Player 4 terminated unexpectedly.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: main_program: Unexpected termination by Unix signal 9(SIGKILL)

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(7),1: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(7),1: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(7),1: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(7),1: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(8),1: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(8),1: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(8),1: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(8),1: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: node_node2: Player 16 terminated unexpectedly.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: main_program: Unexpected exit status 1

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: node_node2: Player 17 terminated unexpectedly.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: main_program: Unexpected exit status 1

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: AU052_cnt,0: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: AU052_cnt,0: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: AU052_cnt,0: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: AU052_cnt,0: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(6),2: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(6),2: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(6),2: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(6),2: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(7),2: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(7),2: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(7),2: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(7),2: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(8),2: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(8),2: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(8),2: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(8),2: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: node_node3: Player 15 terminated unexpectedly.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: main_program: Unexpected exit status 1

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: node_node3: Player 16 terminated unexpectedly.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: main_program: Unexpected exit status 1

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: node_node3: Player 17 terminated unexpectedly.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: main_program: Unexpected exit status 1

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: node_node1: Player 5 terminated unexpectedly.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: main_program: Unexpected exit status 1 (...)

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: CollectTxt_fu,3: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: CollectTxt_fu,3: Input 0 consumed 0 records. (...)

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: CollectTxt_fu,3: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: CollectTxt_fu,3: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(6),3: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(6),3: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(6),3: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(6),3: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(7),3: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(7),3: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(7),3: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(7),3: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(8),3: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(8),3: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(8),3: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(8),3: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(9),3: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(9),3: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(9),3: Output 0 produced 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(9),3: Fatal Error: waitForWriteSignal(): Premature EOF on node tfukmhfirp1 No such file or directory

-----------------------------------------------------------------------------------

I have not posted the full log because it would be huge. All I want to know is whether this is a UNIX problem or a DataStage problem. I have 4 nodes on 4 CPUs. I am not running out of memory, and no one is issuing the kill -9 command. The job runs fine most of the time but fails like this once in a while. Has any of you faced this problem? Were you able to solve it? If so, what was the solution? Any help or input would be greatly appreciated.

Thanks a ton for all your help,
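
A log this long is easier to reason about once it is condensed. The following is only a rough sketch, assuming the log has been saved as plain text in the same "Occurred: ... Type: ..." / "Event: ..." layout as the excerpt above. The function names (parse_events, summarize) and the SAMPLE text are made up for illustration. It reports the first Fatal event, counts Fatal events per operator or process, and picks out any line that mentions a Unix signal.

```python
import re
from collections import Counter

# Matches header lines such as:
#   Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
HEADER = re.compile(
    r"^Occurred: (?P<time>.+?) On date: (?P<date>\S+) Type: (?P<type>\w+)\s*$"
)

# A few entries copied from the log above, used as sample input.
SAMPLE = """\
Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: buffer(6),0: Failure during execution of operator logic.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Info
Event: buffer(6),0: Input 0 consumed 0 records.

Occurred: 5:57:22 PM On date: 12/11/2006 Type: Fatal
Event: main_program: Unexpected termination by Unix signal 9(SIGKILL)
"""


def parse_events(text):
    """Yield (time, type, event_text) for each Occurred/Event pair."""
    pending = None
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            pending = (match.group("time"), match.group("type"))
        elif line.startswith("Event:") and pending:
            yield pending[0], pending[1], line[len("Event:"):].strip()
            pending = None


def summarize(text):
    """Condense a log into first Fatal event, Fatal counts, and signal lines."""
    events = list(parse_events(text))
    fatal = [e for e in events if e[1] == "Fatal"]
    # The text before the first colon names the operator or process,
    # for example "buffer(6),0" or "main_program".
    by_source = Counter(e[2].split(":", 1)[0] for e in fatal)
    signals = [e for e in fatal if "Unix signal" in e[2]]
    return {
        "first_fatal": fatal[0] if fatal else None,
        "fatal_by_source": by_source,
        "signals": signals,
    }


if __name__ == "__main__":
    summary = summarize(SAMPLE)
    print("First fatal:", summary["first_fatal"])
    for source, count in summary["fatal_by_source"].most_common():
        print(f"{count:4d}  {source}")
    for event in summary["signals"]:
        print("Signal:", event[2])
```

Running this over the logs from several failed runs would show whether the same player or operator is always the one reported as killed, which is useful information to include in a follow-up post.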

Posted: Tue Dec 12, 2006 5:05 am
by ajith
Is this a problem with only one job, or with every job? I am asking because I wonder whether your configuration file is working correctly.
The log says:
Premature EOF on node tfukmhfirp1 No such file or directory
Can you check the configuration file entry for that node and confirm that everything is all right?

I could be wrong, though ... this has never happened to me.

Thanks,
Ajith
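
One way to do the configuration check suggested above is to confirm that every disk and scratchdisk resource directory named in the APT configuration file exists. This is a minimal sketch, assuming the file uses the usual resource disk "..." and resource scratchdisk "..." entries. The helper names are hypothetical, and the check only covers the machine it is run on, so it would need to be run on each host listed in the file.

```python
import os
import re

# Matches entries such as:  resource scratchdisk "/some/path" {pools ""}
RESOURCE = re.compile(r'resource\s+(disk|scratchdisk)\s+"([^"]+)"')


def resource_paths(config_text):
    """Return (kind, path) for every disk/scratchdisk resource in the text."""
    return [(kind, path) for kind, path in RESOURCE.findall(config_text)]


def missing_paths(config_text):
    """Return the resources whose directory does not exist on this machine."""
    return [
        (kind, path)
        for kind, path in resource_paths(config_text)
        if not os.path.isdir(path)
    ]


if __name__ == "__main__":
    # The path is the one reported in the job log above.
    config_file = "/opt/Ascential/DataStage/Configurations/default.apt"
    with open(config_file) as handle:
        text = handle.read()
    for kind, path in missing_paths(text):
        print(f"missing {kind}: {path}")
```

An empty result only shows that the directories exist. It says nothing about free space or permissions, which would have to be checked separately.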

Posted: Tue Dec 12, 2006 6:24 am
by ashik_punar
Hi,

This problem is not tied to a particular job. Sometimes one job fails with this error, sometimes a different job fails with it, and sometimes the sequencer runs fine without any issues. So there does not seem to be a problem with the config file as such.

Thanks for your help,

Posted: Tue Dec 12, 2006 3:37 pm
by ray.wurlod
Did you Search the forum for SIGKILL? Others have experienced this in the past.

Posted: Tue Dec 19, 2006 8:35 am
by ashik_punar
Hi Ray,

I did search the forum, but no solution or probable cause has been given for this problem. If this is a resource problem, I can certainly raise the concern and request more hardware resources, but none of the posts says so clearly. Please let me know whether this is a resource problem.

Thanks in Advance,

Posted: Tue Dec 19, 2006 2:34 pm
by ray.wurlod
You seem to have a large number of buffer operators inserted in the job. Can you inspect the score and try to ascertain why this might be? Also, what are the settings of the environment variables associated with buffering?

Also from the score can you determine how much use is being made of combination operators?
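
To answer the question about buffering settings, the relevant environment variables can be listed in one go. The sketch below is hedged: the variable names are the buffering, score and combination variables as I recall them from the parallel engine documentation, so verify them against your version. The script only sees values present in the environment it is run from (for example a shell where dsenv has been sourced). Values set per project or per job are better read from the "Environment variable settings" event in the job log.

```python
import os

# Names recalled from the parallel engine documentation; treat the list as
# an assumption and adjust it for your release.
BUFFER_VARS = [
    "APT_BUFFERING_POLICY",
    "APT_BUFFER_MAXIMUM_MEMORY",
    "APT_BUFFER_FREE_RUN",
    "APT_BUFFER_DISK_WRITE_INCREMENT",
    "APT_BUFFER_MAXIMUM_TIMEOUT",
    "APT_DUMP_SCORE",
    "APT_DISABLE_COMBINATION",
]


def buffering_settings(env=None):
    """Return the value of each variable of interest, or '<not set>'."""
    env = os.environ if env is None else env
    return {name: env.get(name, "<not set>") for name in BUFFER_VARS}


if __name__ == "__main__":
    for name, value in buffering_settings().items():
        print(f"{name} = {value}")
```

Setting APT_DUMP_SCORE for a run should make the score appear in the job log, where the inserted buffer operators and any combined operators can be inspected.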