Parallel error while lookup

samba
Premium Member
Posts: 62
Joined: Wed Dec 07, 2005 11:44 am

Post by samba »

I am getting these types of fatal errors when I run the job:


node_node1: Player 11 terminated unexpectedly.
main_program: Unexpected termination by Unix signal 9(SIGKILL)
main_program: Unexpected exit status 1
Unexpected exit status 1
Unexpected exit status 1
Unexpected exit status 1
Unexpected exit status 1
main_program: Step execution finished with status = FAILED.

When I rerun the job without any modification, it completes successfully with no warnings.

The job contains several lookups against one particular table and other lookups against other tables.

ora ----> Trans -----> lookupstage ------>

Here I have 8 lookups from different tables and 7 lookups from the same table with different queries.

The data from the lookup tables is also limited, around 500 records.

I do not get this error frequently: about 1 run out of 10 fails with these fatal errors.


Thanks in advance
samba
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

This error may occur randomly due to overload of the server. The job might have aborted at peak load, and when you rerun it at a reduced load it might go through. Try splitting the job so that each job has fewer lookups. If you have a Transformer with heavy transformations, try splitting that as well. Also try setting Combinability mode to false. If the job is run from a sequence, try to arrange for the resource-hungry job to run sequentially with the other jobs rather than alongside them.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

No, it's not overload. Someone has killed one (if not more) of the processes with kill -9 on the server. That's what generated the "Unexpected termination by Unix signal 9 (SIGKILL)" message. If you did that, you deserve all of the problems that you get. If it was someone else, penalize them in some way.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
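
To illustrate how a "terminated by Unix signal 9 (SIGKILL)" report arises in general, here is a minimal Python sketch for a Unix-like system. It is not DataStage code, and the function names run_and_kill and describe are made up for this example. It only shows that a parent process can tell a child ended by a signal apart from one that returned an ordinary exit status; it says nothing about who or what sent the signal.

```python
import signal
import subprocess
import sys


def run_and_kill():
    """Start a child process, send it SIGKILL, and return its return code."""
    # The child would otherwise sleep for a minute.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    # Equivalent to running `kill -9 <pid>` against the child.
    child.send_signal(signal.SIGKILL)
    # On POSIX, a negative return code -N means "ended by signal N".
    return child.wait()


def describe(returncode):
    """Render a return code the way a supervising process might log it."""
    if returncode < 0:
        signum = -returncode
        name = signal.Signals(signum).name
        return "terminated by signal %d (%s)" % (signum, name)
    return "exit status %d" % returncode


if __name__ == "__main__":
    print(describe(run_and_kill()))
```

SIGKILL cannot be caught or ignored by the receiving process, so the child has no chance to log anything itself; the parent is the only place the event shows up.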
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Ray, I have faced a similar issue with very large transformations. I recall that I raised a post about it. Though no one explicitly issues a SIGKILL, it gets generated implicitly, at least by the controller of the code. Maybe it is due to the Conductor; I am not sure.