Detection of node failure too long

UPS
Premium Member
Posts: 56
Joined: Tue Oct 10, 2006 12:18 pm
Location: New Jersey

Post by UPS

We are testing node-failure scenarios in our DataStage cluster and find that the conductor node can take a very long time to detect the failure of a compute node that is executing a job: as long as 25 minutes in one case. The job hangs for that whole period before it finally aborts. Is there a setting that controls this interval, so that the conductor detects sooner that a section leader is no longer there?