grid issue : main_program: Accept timed out retries = 16


pascalnicolasl
Premium Member
Posts: 12
Joined: Thu Dec 18, 2008 8:55 am


Post by pascalnicolasl »

Hello All,

Until today all of our jobs were running fine.
Since today, the job fails when it runs in the grid environment.
We get this error message:
main_program: Accept timed out retries = 16

The job runs fine when grid is disabled.

Any hint on this will be helpful.
keshav0307
Premium Member
Posts: 783
Joined: Mon Jan 16, 2006 10:17 pm
Location: Sydney, Australia

Post by keshav0307 »

All the compute nodes in the grid queue are busy. The qstat command will show how many jobs are already in the queue, so delete the jobs from the grid queue using qdel.

By disabling the grid, you are bypassing the queue and overloading the compute node.
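
For anyone following along, here is a rough sketch of the qstat/qdel check described above. It assumes your resource manager puts qstat and qdel on the PATH; the function names show_grid_queue and remove_grid_job are made up for illustration, and the job ID is a placeholder you would copy from your own qstat listing. Output format and available options depend on which resource manager the grid uses, so treat this as a starting point only.

```shell
# Sketch only: assumes qstat and qdel are provided by the grid's resource manager.

# List the jobs currently known to the grid queue.
show_grid_queue() {
    if ! command -v qstat >/dev/null 2>&1; then
        echo "qstat not found on PATH" >&2
        return 1
    fi
    qstat
}

# Remove one job from the queue.
# $1 is a job ID taken from the qstat listing (placeholder, not a real ID).
remove_grid_job() {
    if [ -z "$1" ]; then
        echo "usage: remove_grid_job JOB_ID" >&2
        return 2
    fi
    qdel "$1"
}
```

Run show_grid_queue first, check which jobs are really stuck, then call remove_grid_job once per job ID. Deleting jobs that other users still need will kill their runs, so check ownership before removing anything.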
pascalnicolasl
Premium Member
Posts: 12
Joined: Thu Dec 18, 2008 8:55 am

Post by pascalnicolasl »

Restarting the Linux box resolved the problem.

Thanks