
Limits on number of links out of a Switch / into a Join

Posted: Fri May 18, 2007 5:06 am
by miwinter
Hi guys,

This is a bit of a long shot but...

I have a job design here which uses a Switch stage feeding a Join. The Switch and the Join are both fed by datasets, and the Switch has 18 output links in all. The job fails frequently. It doesn't appear to be contention-related, as I have tried running the jobs one at a time, in isolation. It doesn't seem to be volume-related either, as it happens equally on small and large data volumes.

The error we see seems to actually relate to the Join stage (which takes in the 18 links from the Switch), namely:

"node_node1: Player 19 terminated unexpectedly"
"main_program: Unexpected termination by Unix signal 9(SIGKILL)"

Does anyone know if there is some kind of limit we should adhere to, either on the number of output links from a Switch or on the number of input links a Join can take?

Re-running these jobs does eventually succeed (usually after 3 or 4 attempts).

Cheers fellas

Posted: Fri May 18, 2007 5:56 am
by JoshGeorge
Set APT_NO_JOBMON to True. This should solve your problem. This was identified earlier in this mighty site :)
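
For anyone who starts jobs from a script, here is a minimal sketch of passing that setting on a per-run basis. It assumes APT_NO_JOBMON has already been added to the job as an environment-variable job parameter (such parameters are passed to dsjob with a leading $); the project name, job name and the helper build_dsjob_command are hypothetical and only there for illustration. Setting the variable project-wide in the Administrator client is the other common route and needs no script at all.

```python
import subprocess


def build_dsjob_command(project, job, env_overrides):
    """Build a dsjob command line that overrides environment-variable job parameters.

    Assumption: every key in env_overrides is already defined on the job as an
    environment-variable parameter, so it is passed as -param '$NAME=value'.
    """
    cmd = ["dsjob", "-run"]
    for name, value in env_overrides.items():
        cmd += ["-param", f"${name}={value}"]
    cmd += [project, job]
    return cmd


if __name__ == "__main__":
    # "MyProject" and "SwitchJoinJob" are placeholder names, not from this thread.
    command = build_dsjob_command(
        "MyProject", "SwitchJoinJob", {"APT_NO_JOBMON": "True"}
    )
    print(" ".join(command))
    # Uncomment on a machine where dsjob is on the PATH:
    # subprocess.run(command, check=False)
```

The command is built as a list rather than a single string so the $ in the parameter name is never expanded by a shell.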

Posted: Fri May 18, 2007 7:06 am
by miwinter
Cheers Josh, I'll give that a spin :D


EDIT...

PS. Does anyone know the reason behind this issue?

Posted: Thu Jan 01, 2009 2:28 am
by attu
miwinter wrote:
"Cheers Josh, I'll give that a spin :D

EDIT...

PS. Does anyone know the reason behind this issue?"

We had the same issue, and after disabling the job monitor with APT_NO_JOBMON the job ran fine. My question is: is it a bug with 7.51A and AIX 5.3 (we have ML 6 SP 3)? Why does only this job fail, and not the others?
Appreciate your responses.
Thanks

Posted: Thu Jan 01, 2009 4:22 am
by ray.wurlod
So your job also features a Switch stage with 18 links?

If not, you're hijacking this thread, which we frown upon.

Please begin a new thread, with a meaningful subject.

This will assist future searchers.