Sockets stuck in SYN_RCVD (re(4), RELENG_7, i386)

Oliver Fromme olli at
Mon Nov 26 11:37:18 PST 2007


Now I have an additional piece of information for this bug.

Today I noticed that the second system -- which did not
exhibit the problem so far -- also started collecting
sockets in the SYN_RCVD state ("netstat -n").

Extrapolating the current count and growth rate, it must
have started on Saturday.  The machine then had an uptime
of 25 days -- about the same uptime as the first machine
when it started to show this problem.

Whatever triggers the bug, it seems to be uptime-related.
Both machines are running with HZ=1000 (the default).
A signed int variable counting ticks at HZ speed would overflow
after 2^31 ticks, i.e. 2^31/HZ seconds, which happens to be
about 24.9 days ...

So it seems this is what's happening:  Somewhere in the
kernel (probably the TCP syncache code) there's a piece
of code using uptime information in HZ resolution for
timing purposes or whatever.  However, it uses a signed
int, maybe just for intermediate results, causing an
overflow after 2^31/HZ seconds, which leads to wrong
results and finally hanging sockets in the SYN_RCVD state.

Could anyone familiar with that part of the kernel help me
locate the bug in the code?  I'm pretty sure that my analysis
isn't far from the truth.  I'm also pretty sure that a type
cast at the right place will fix it.  The problem is to find
the right place.

Best regards

Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:

"C is quirky, flawed, and an enormous success."
        -- Dennis M. Ritchie.

More information about the freebsd-current mailing list