Head node: prodhn01
Compute node: prodcn01
My job is meant to run on the above head node and the compute node.
Sometimes I get this error:
APT_PMaddrInfoFor(): getaddrinfo(host = prodcn01, port = ) failed with error -3 - Temporary failure in name resolution: node prodcn01 cannot be used
When I rerun the job after a minute or so, it runs fine on the same compute node.
As I understand it, this is a host lookup failure: DNS was not able to resolve the name at that moment. I went to the DNS team, but they couldn't find any errors or warnings on their systems.
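To give the DNS team something concrete to correlate with their logs, a small probe along these lines could be left running on the head node. It is only a sketch: `probe` and `monitor` are hypothetical helper names, and it assumes Python 3 is available on the node. It calls the same `getaddrinfo` lookup that appears in the error message and prints a timestamp for every attempt.

```python
import socket
import time
from datetime import datetime


def probe(host, resolver=socket.getaddrinfo):
    """Resolve host once and return (ok, detail)."""
    try:
        infos = resolver(host, None)
        # Each entry ends with a sockaddr tuple whose first item is the address.
        addrs = sorted({info[4][0] for info in infos})
        return True, ", ".join(addrs)
    except socket.gaierror as exc:
        # socket.EAI_AGAIN is the "Temporary failure in name resolution" case.
        return False, f"error {exc.errno} - {exc.strerror}"


def monitor(hosts, interval=5.0, rounds=3, resolver=socket.getaddrinfo):
    """Probe every host once per round and return the list of failures."""
    failures = []
    for i in range(rounds):
        for host in hosts:
            ok, detail = probe(host, resolver)
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host} {'OK' if ok else 'FAIL'} {detail}")
            if not ok:
                failures.append((stamp, host, detail))
        if i < rounds - 1:
            time.sleep(interval)
    return failures
```

For example, `monitor(["prodhn01", "prodcn01"], interval=5, rounds=720)` would cover roughly an hour. If failures show up here too, the problem is in name resolution on the node itself and not in the job.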
Also, the DNS lookup sometimes fails even for the head node. I don't understand why I would get the same error for the head node when my job is running on that very node:
APT_PMaddrInfoFor(): getaddrinfo(host = prodhn01, port = ) failed with error -3 - Temporary failure in name resolution: node prodhn01 cannot be used
Any ideas?
I am thinking of adding both nodes to the hosts file to work around this issue...
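One thing worth checking before relying on the hosts file: on Linux with glibc, `getaddrinfo` only consults /etc/hosts ahead of DNS if `files` comes before `dns` on the `hosts:` line of /etc/nsswitch.conf. The sketch below assumes that setup; `hosts_lookup_order` and `files_before_dns` are hypothetical helper names, not part of any tool.

```python
import re


def hosts_lookup_order(nsswitch_text):
    """Return the sources listed on the 'hosts:' line, or [] if there is none."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Remove action items such as [NOTFOUND=return]; keep source names.
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def files_before_dns(order):
    """True if /etc/hosts ('files') is consulted before DNS."""
    if "files" not in order:
        return False
    return "dns" not in order or order.index("files") < order.index("dns")
```

Passing the contents of /etc/nsswitch.conf (for example `open("/etc/nsswitch.conf").read()`) to `hosts_lookup_order` and then to `files_before_dns` shows whether a hosts file entry would actually take precedence over DNS on each node.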