Failed to connect to JobMonApp on port 13401
Posted: Thu Mar 24, 2016 10:37 am
Occasionally I'm seeing the error shown further down in this post being thrown.
We're only seeing this occur in one job/project. It just happens to be our largest. The job runs every night and spawns hundreds of other jobs in the project. We've seen this happen 3-4 times in the past 3 months. Upon receiving the error, we can immediately restart the job and everything will run fine.
Can anyone suggest things to look at to help identify what might be happening? I have my server and network guys looking at things on their end to see if it's a hardware bug. I'm wondering whether there's something on the application side that would provide more insight, or something that could be changed, such as a timeout variable.
Code:
Failed to connect to JobMonApp on port 13401
main_program: Received SIGPIPE signal caused by closing of the socket on port 13,400.
No output will be sent to port 13,400 for the rest of the job.
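One application-side check that might help narrow this down is a small probe that records whether anything is accepting TCP connections on the two ports named in the messages above (13400 and 13401). The sketch below is only an illustration under assumptions: that the job monitor listens on plain TCP, that it can be reached as "localhost" from the machine where the probe runs, and that the function names `probe_port` and `probe_all` are my own, not part of any product API. Run periodically (for example from cron around the nightly window), it could show whether the listener disappears before the job reports the failure.

```python
import socket
import time


def probe_port(host, port, timeout=2.0):
    """Try to open a TCP connection; return (ok, detail)."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # if nothing accepts the connection within the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        return False, f"{type(exc).__name__}: {exc}"


def probe_all(host, ports, timeout=2.0):
    """Probe each port once and return rows of (timestamp, port, ok, detail)."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    rows = []
    for port in ports:
        ok, detail = probe_port(host, port, timeout)
        rows.append((stamp, port, ok, detail))
    return rows


if __name__ == "__main__":
    # Host is an assumption; the ports are the ones from the error messages.
    for stamp, port, ok, detail in probe_all("localhost", [13400, 13401]):
        status = "OK" if ok else "FAILED"
        print(f"{stamp} port {port}: {status} ({detail})")
```

Appending the output to a log file would give timestamps to compare against the job log and against whatever the server and network teams find.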