
CLOSE_WAIT in netstat -a

Posted: Tue May 13, 2003 4:50 am
by desais_2001
Hi,

We are facing a problem where, after a client disconnects, it takes a very long time for the connection to be released. When I investigated using netstat -a | grep uv, I found that these connections were stuck in the CLOSE_WAIT state for an unusually long period, many minutes. The CLOSE_WAIT sessions remain until the DataStage engine is restarted, and with many of them active, new connections to the DataStage engine are refused.
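For reference, this is roughly the check we run ("uv" matches the service name of our DataStage/UniVerse listener in the netstat output; the pattern will differ on other setups):

    netstat -a | grep uv | grep CLOSE_WAIT    # stuck DataStage connections
    netstat -a | grep -c CLOSE_WAIT           # total count, all services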
The server is not running out of grunt: it's a Sun Fire 6800 with 8 x 900MHz CPUs, 16GB of memory, and 990GB of disk space out of a 6.5TB SAN.
We need help with the following:
1) What are these CLOSE_WAIT sessions? Are they between the DataStage client and server, or between the DataStage server and the database server?
2) How do we identify the PIDs involved in the CLOSE_WAIT sessions shown by netstat -a?
3) How do we prevent CLOSE_WAIT events from occurring?
4) Can we identify and kill all the PIDs involved in CLOSE_WAIT connections without restarting the DataStage engine?

Thanks in Advance.

Sanjay

Posted: Tue May 13, 2003 4:08 pm
by chulett
Basically, CLOSE_WAIT on the server side means that the server's TCP stack has received and acknowledged the FIN from the client, but the server application hasn't closed its end yet, so no FIN has gone back to the client. After a period of time, the client side's FIN_WAIT_2 just gives up and closes, and you're left with a half-closed connection, so to speak (or so I was told). To avoid this state, the server needs to promptly recognize that the client has "hung up" and close the socket.
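Here's a rough sketch in plain C (not anything out of DataStage, just an illustration of the general pattern) of what promptly recognizing the hang-up looks like at the socket level:

    /* Sketch only: a read() of 0 bytes means the peer sent FIN and the
     * socket is now in CLOSE_WAIT; closing our end promptly moves it
     * on (through LAST_ACK) instead of leaving it stuck. */
    #include <unistd.h>

    void serve_client(int sock)
    {
        char buf[4096];
        ssize_t n;

        while ((n = read(sock, buf, sizeof buf)) > 0) {
            /* ... handle n bytes of request data ... */
        }

        /* n == 0: client hung up; n < 0: error. Close either way. */
        close(sock);
    }

An application that ignores the zero-byte read, or never gets around to servicing that socket again, leaves the connection sitting in CLOSE_WAIT until the process exits.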

FWIW, I've seen a lot of references on the web that say a large number of sockets sitting in CLOSE_WAIT is the sign of a "buggy application" - one that does not properly close its sockets when it's done with them.

If you need something to identify the processes associated with the sockets in that state, look for "lsof" - an Open Source "List Open Files" utility. You should then be able to kill the processes binding up the sockets without needing to resort to a reboot.
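For example, something along these lines (a sketch - double-check what each PID actually is before killing it, since lsof's output columns can vary a bit between versions):

    lsof -i tcp | grep CLOSE_WAIT    # COMMAND and PID for each stuck socket
    kill <pid>                       # then kill the offender(s)

where <pid> is whatever lsof reports in its PID column.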

-craig