
The Section Leader on node huxd0202 has terminated

Posted: Tue Jun 21, 2011 4:11 am
by prasanna_anbu
Hi,
My DataStage job failed after one hour with the following error messages. Can you please help me with this?

APT_CombinedOperatorController(1),1: Heap growth during runLocally(): 39846K bytes

main_program: The Section Leader on node huxd0202 has terminated unexpectedly. [processmgr/slprocess.C:235]

main_program: Releasing Section Leaders; parallel step time was 3,623.411 seconds.

Configuration file:
{
node "node1"
{
fastname "huxd0312"
pools ""
resource disk "/etl/IS/Datasets" {pools ""}
resource scratchdisk "/etl/IS/Scratch" {pools ""}
}

node "node2"
{
fastname "huxd0312"
pools ""
resource disk "/etl/IS/Datasets1" {pools ""}
resource scratchdisk "/etl/IS/Scratch1" {pools ""}
}

node "node3"
{
fastname "huxd0312"
pools ""
resource disk "/etl/IS/Datasets2" {pools ""}
resource scratchdisk "/etl/IS/Scratch2" {pools ""}
}

node "node4"
{
fastname "huxd0312"
pools ""
resource disk "/etl/IS/Datasets3" {pools ""}
resource scratchdisk "/etl/IS/Scratch3" {pools ""}
}

node "huxd0202"
{
fastname "huxd0202"
pools "db2"
resource disk "/tmp" {pools ""}
resource scratchdisk "/tmp" {pools ""}
}
}

Posted: Tue Jun 21, 2011 9:30 am
by priyadarshikunal
Do you have any other error or warning messages in this job? If yes, please post them as well.

You can also search for this kind of failure message.

Posted: Tue Jun 21, 2011 9:36 am
by prasanna_anbu
Thanks. I have searched for the failure message but was unable to find a solution. My job runs fine with a smaller amount of data, but when I use a larger data volume, the job exceeds the one-hour time limit and fails with the above message.
Other than this, there are no warnings or errors.

Posted: Tue Jun 21, 2011 4:49 pm
by ray.wurlod
My personal philosophy is never to try to diagnose messages from APT_CombinedOperatorController.

Disable operator combination and try again. That way you will ascertain precisely which operator threw the message.
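
One common way to disable operator combination is to set the environment variable APT_DISABLE_COMBINATION to True for the job, usually by adding $APT_DISABLE_COMBINATION as a job parameter. The sketch below is only an illustration of that idea: it builds a dsjob command line that passes the parameter at run time. It assumes the job already defines $APT_DISABLE_COMBINATION as a parameter, and the project and job names are hypothetical placeholders.

```python
import shlex


def build_dsjob_command(project, job, disable_combination=True):
    """Return the argument list for a dsjob run of the given job.

    Assumption: the job exposes $APT_DISABLE_COMBINATION as a job
    parameter, so it can be overridden with -param at run time.
    """
    cmd = ["dsjob", "-run"]
    if disable_combination:
        # Run every operator in its own process so the log names the
        # real operator instead of APT_CombinedOperatorController.
        cmd += ["-param", "$APT_DISABLE_COMBINATION=True"]
    # -jobstatus makes dsjob wait and derive its exit code from the job status.
    cmd += ["-jobstatus", project, job]
    return cmd


if __name__ == "__main__":
    # "MyProject" and "MyJob" are hypothetical names.
    print(shlex.join(build_dsjob_command("MyProject", "MyJob")))
```

With combination disabled, the job usually runs with more processes, so it makes sense to turn it back on once the failing operator has been identified.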