FreeBSD 7.0: sockets stuck in CLOSED state...

Robert Watson rwatson at FreeBSD.org
Thu Jun 26 07:50:13 UTC 2008


On Wed, 25 Jun 2008, Ali Niknam wrote:

>> precisely matches that what you'd expect: lots of TCP connections in the 
>> CLOSED state reflecting a series of connections built by an application but 
>> then not properly discarded. Likewise, when the application is killed, all 
>> of the connections go away -- most likely because the file descriptors are 
>> all closed, allowing them to be garbage collected and connection state 
>> freed.  If it is this sort of bug, then most likely you're missing a call 
>> to close() in a work loop somewhere, and in some exceptional case, you fall 
>> out of the loop without calling close().
>
> I will double check this once more, but honestly, i strongly doubt it...
>
> Also one other thing that I've noticed, is that it's always the input buffer 
> that has bytes left; never the output buffer...
>
> Moreover, i've seen that close() reports EBADF, but due to the insane amount 
> of connections I can not say for certain that that's when the connection 
> goes into CLOSED state. The ip's do match, but it's very common for the same 
> ip's to make numerous connections too.

I think the first logical step is to wait for the application to get into that 
state again, and then run procstat or fstat to dump the file descriptor array 
for the process.  Presumably in the normal steady state, you expect to see a 
few IPC sockets (syslog, etc), a TCP listen socket, and some number of 
in-progress TCP sessions.  The question, of course, is whether you see a lot 
more file descriptors than that, and in particular, ones that match the 
CLOSED entries in netstat.  If you find that there are lots of open file 
descriptors and they match up approximately with netstat, then it's an 
application bug that just manifests a bit differently in 7.x than in 6.x.  On 
the other hand, if you see only a small number of open file descriptors, then 
we may be looking at something quite a bit more complicated.
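
A rough way to put numbers on that comparison is sketched below.  It assumes 
FreeBSD-style "netstat -an -p tcp" output, where the connection state is the 
last column, and "fstat -p" output, where TCP sockets appear as "internet 
stream tcp" (or "internet6 stream tcp").  The function names are made up for 
illustration, and the counts are only a first approximation: the netstat 
figure is system-wide, not per-process.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not from any existing tool.

# Count connections in the CLOSED state.
# Reads netstat-style output on stdin; assumes the state is the last field.
count_closed() {
    awk '$NF == "CLOSED" { n++ } END { print n + 0 }'
}

# Count descriptors that look like TCP sockets.
# Reads fstat-style output on stdin; assumes socket lines contain
# "internet stream tcp" or "internet6 stream tcp".
count_tcp_fds() {
    awk '/internet6? stream tcp/ { n++ } END { print n + 0 }'
}

# Print both counts for one process so they can be compared by eye.
compare_fds_to_closed() {
    pid=$1
    closed=$(netstat -an -p tcp | count_closed)
    fds=$(fstat -p "$pid" | count_tcp_fds)
    echo "CLOSED connections (system-wide): $closed"
    echo "TCP socket descriptors held by pid $pid: $fds"
}
```

Running "compare_fds_to_closed 1234" (1234 being a placeholder pid) while the 
application is in the bad state should show whether the two numbers are of 
the same order of magnitude.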

I would next seek to confirm the analysis that "they go away when the 
application is killed" -- do they really disappear at the very moment it 
exits, or do they disappear gradually, so that it just happens that by the 
time you run netstat after killing the application, they're gone?  I.e., I'd 
try something like "netstat -na > file1 ; kill pid ; sleep 1 ; netstat -na > 
file2 ; diff -u file1 file2".  If they really all go away in a large quantity 
the moment the process dies, then the reference model is working (i.e., they 
are freed), but perhaps references are being held onto in an unexpected way. 
For example, is the incomplete listen queue somehow getting filled with CLOSED 
sockets that are only garbage collected when close() is called on the listen 
socket?  If we suspect that, we can actually test it by having your 
application close the listen socket and re-open it once in a while, and see if 
the CLOSED sockets fail to stack up.
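
The snapshot-and-diff check described above can be wrapped up roughly as 
follows.  This is only a sketch: the function names and the temporary file 
template are illustrative, and the counting helper assumes that the state is 
the last column of each netstat line.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not from the original mail.

# Snapshot the socket table, kill the process, wait a second, snapshot
# again, and print a unified diff of the two snapshots on stdout.
diff_around_kill() {
    pid=$1
    before=$(mktemp /tmp/netstat.XXXXXX) || return 1
    after=$(mktemp /tmp/netstat.XXXXXX) || { rm -f "$before"; return 1; }
    netstat -na > "$before"
    kill "$pid"
    sleep 1
    netstat -na > "$after"
    diff -u "$before" "$after"
    rm -f "$before" "$after"
}

# Count CLOSED entries that vanished between the two snapshots.
# Reads unified diff output on stdin; removed lines start with a single "-",
# while the "---" file header is skipped.
count_closed_removed() {
    awk '/^-[^-]/ && $NF == "CLOSED" { n++ } END { print n + 0 }'
}
```

For example, "diff_around_kill 1234 | count_closed_removed" (with 1234 as a 
placeholder pid) prints how many CLOSED entries disappeared within a second 
of the kill.  On FreeBSD, "netstat -L" also reports the listen queue sizes, 
which may help when looking at the incomplete listen queue question.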

Speaking of which, I meant to ask: are you using accept filters, and if so, 
which one?

Robert N M Watson
Computer Laboratory
University of Cambridge


More information about the freebsd-net mailing list