Detection of node failure too long

UPS: Premium Member; Posts: 56; Joined: Tue Oct 10, 2006 12:18 pm; Location: New Jersey

Post by UPS » Thu Jan 18, 2007 4:21 pm

We are testing node-failure scenarios in our DataStage cluster and find that the conductor node can take a very long time to detect the failure of a compute node that is executing a job, as long as 25 minutes in one case. The job simply hangs until it finally aborts. Is there a setting that controls this detection time, so that the conductor notices sooner that a section leader is no longer there?

Return to “IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)”