ETL job aborts

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

varshanswamy
Participant
Posts: 48
Joined: Thu Mar 11, 2004 10:32 pm

ETL job aborts

Post by varshanswamy »

node_node1: Player 5 terminated unexpectedly.
main_program: Unexpected termination by Unix signal 14(SIGALRM)
main_program: Step execution finished with status = FAILED.

I would like to know what could be causing these jobs to fail.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Hello varshanswamy,

You will need to supply quite a lot more information before anyone here can help you! This is like calling your support line, saying "my program won't run" and expecting to get help.

Overall questions - ... actually, so many I don't really know where I should start asking. But I'll give it a shot:

1. Does the job abort before processing anything?
2. Have you turned on or used any of the tracing or debugging options?
3. Are you writing to tables? Are you reading from DB tables?
4. Have you looked at the details in the Director log?
5. Have you looked at the UNIX log files?

Any information at all that might narrow down a cause will be helpful (firstly to yourself and then perhaps to those here in the forum). Put yourself in the shoes of someone reading your message: what questions would you ask? That is the type of information that even the experts need to help solve a problem.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Can you identify the processing node associated with Player 5? (The metaphor here is an orchestra; there is a conductor, section leaders and players.) If you can, then you can narrow your search to configuration and other factors associated with that processing node, such as whether all required components are installed there.

Unfortunately, SIGALRM can mean almost anything. It is defined as "alarm clock timeout", which only tells you that some kind of timeout, perhaps while waiting for some event, has occurred.
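To illustrate what "terminated by Unix signal 14" means at the operating system level, here is a minimal Python sketch. It has nothing to do with DataStage internals; it assumes a Unix-like system with Python 3, and the helper names are made up for this illustration. A child process arms an alarm, installs no handler, and is killed by the default action when the alarm fires, so the parent sees a signal termination rather than a normal exit status.

```python
import signal
import subprocess
import sys

# Child program: arm a 1-second alarm, install no handler, then sleep.
# With no handler, the default action for SIGALRM terminates the process.
CHILD = "import signal, time; signal.alarm(1); time.sleep(30)"


def run_child_until_alarm():
    """Run the child and return its exit status as seen by the parent."""
    return subprocess.run([sys.executable, "-c", CHILD]).returncode


def describe(returncode):
    """Turn a subprocess return code into a short message."""
    if returncode < 0:
        # subprocess reports "killed by signal N" as -N
        sig = signal.Signals(-returncode)
        return f"terminated by signal {sig.value} ({sig.name})"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    print(describe(run_child_until_alarm()))
```

This only shows how the symptom arises. It says nothing about which timer expired in the failing job, so the underlying cause still has to be found in the job and system logs.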
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
s1kaasam
Participant
Posts: 17
Joined: Wed Feb 02, 2005 5:11 pm
Location: virginia

Post by s1kaasam »

Hi
You haven't specified which UNIX OS you are using. If it is SunOS, there is a patch that has to be installed on the server and included in LD_LIBRARY_PATH in the dsenv file. If you confirm that it is SunOS, I can post the details so that you can check whether the server has the required patch and then include it in your dsenv file. This has been a bug with SunOS.
shravan
urahul
Participant
Posts: 9
Joined: Tue Dec 07, 2004 12:44 pm

Post by urahul »

Hello,
I am using SunOS (SunOS edwps405 5.8 Generic_117350-21 sun4u sparc SUNW,Sun-Fire-15000)
and am facing the same problem.
Could you please advise me on the patch you mentioned and any relevant details?

Thanks very much
Rahul


s1kaasam
Participant
Posts: 17
Joined: Wed Feb 02, 2005 5:11 pm
Location: virginia

Post by s1kaasam »

Hi
The change you need to make is to the dsenv file on the SunOS box. Log in as the DataStage administrator and view your dsenv file. The file sets LD_LIBRARY_PATH; to it you need to add '/usr/lib/lwp' for lightweight processes.

But you need to make sure that /usr/lib/lwp is installed (that is, exists) on your box.

After you have added it, you need to stop and restart the DataStage services so that the change is picked up. I hope you know the steps to stop and start the DataStage services.
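Before editing dsenv, the two checks described above (does the directory exist, and is it already in LD_LIBRARY_PATH) can be sketched in Python. This is only an illustration: the function names are hypothetical, and putting the directory at the front of the path is my assumption, since the post only says to add it.

```python
import os

LWP_DIR = "/usr/lib/lwp"  # directory named in the post above


def has_path_entry(path_value, entry=LWP_DIR):
    """Return True if entry is one of the colon-separated path elements."""
    entries = [p.rstrip("/") for p in path_value.split(":") if p]
    return entry.rstrip("/") in entries


def with_path_entry(path_value, entry=LWP_DIR):
    """Return path_value with entry added at the front if it is missing.

    Prepending is an assumption made for this sketch.
    """
    if has_path_entry(path_value, entry):
        return path_value
    return f"{entry}:{path_value}" if path_value else entry


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("directory exists:", os.path.isdir(LWP_DIR))
    print("already in LD_LIBRARY_PATH:", has_path_entry(current))
    print("suggested value:", with_path_entry(current))
```

Run it in a shell that has sourced dsenv so that it sees the same LD_LIBRARY_PATH as the DataStage services. It only reports; the edit to dsenv and the service restart are still done by hand.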

If you have any questions, please let me know.
shravan
Amos.Rosmarin
Premium Member
Posts: 385
Joined: Tue Oct 07, 2003 4:55 am

Post by Amos.Rosmarin »

Hi,

I'm facing the same problem with signal 14.

Adding /usr/lib/lwp to LD_LIBRARY_PATH did not help,
but I think it's good to have it in the library path anyway.

Waiting a few minutes and then re-running the job usually works.
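The wait-and-re-run workaround can be automated with a small wrapper. The sketch below is generic Python; the function name, the attempt count and the five-minute delay are assumptions of mine, and the command is a placeholder for whatever command line starts the job at your site. It works around the symptom and does not address the cause of the timeout.

```python
import subprocess
import sys
import time


def run_with_retry(cmd, attempts=3, delay_seconds=300):
    """Run cmd; on a non-zero exit, wait and try again.

    Returns (last_returncode, number_of_tries).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0 or attempt == attempts:
            return returncode, attempt
        time.sleep(delay_seconds)  # give the system time to settle


if __name__ == "__main__":
    # Placeholder command that always succeeds; substitute the real one.
    placeholder = [sys.executable, "-c", "raise SystemExit(0)"]
    print(run_with_retry(placeholder, attempts=3, delay_seconds=1))
```

A negative return code means the command itself was killed by a signal, which is worth logging separately from an ordinary failure.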

More ideas will be appreciated

Cheers,
Amos
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

We are currently trying to see if adding swap space helps.
I'll post if the issue reoccurs.
Roy R.
Time is money but when you don't have money time is all you can afford.
