Sockets stuck in SYN_RCVD (re(4), RELENG_7, i386)
olli at lurza.secnetix.de
Mon Nov 26 11:37:18 PST 2007
Now I have an additional piece of information for this bug.
Today I noticed that the second system -- which did not
exhibit the problem so far -- also started collecting
sockets in the SYN_RCVD state ("netstat -n").
Extrapolating from the current count and growth rate, it must
have started on Saturday. The machine then had an uptime
of 25 days -- about the same uptime as the first machine
when it started to show this problem.
Whatever triggers the bug, it seems to be uptime-related.
Both machines are running with HZ=1000 (the default).
A signed 32-bit int counter incremented at HZ speed would overflow
after 2^31 ticks, i.e. 2^31/HZ seconds, which happens to be 24.9 days ...
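As a quick check of that arithmetic, here is a minimal sketch. The
function name is made up for illustration and is not kernel code; it
assumes a 32-bit signed counter that starts at zero and is incremented
HZ times per second.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the FreeBSD sources): number of days
 * until a 32-bit signed tick counter, incremented hz times per second
 * and starting at zero, runs past INT32_MAX.
 */
static double
days_until_int32_overflow(int hz)
{
	double ticks = (double)INT32_MAX + 1.0;	/* 2^31 ticks */
	double seconds = ticks / (double)hz;	/* 2^31/HZ seconds */

	return (seconds / 86400.0);		/* 86400 seconds per day */
}
```

With hz = 1000 this gives roughly 24.86 days, which is in line with the
uptime at which both machines started to misbehave. With hz = 100 the
same counter would last about ten times longer.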
So it seems this is what's happening: Somewhere in the
kernel (probably the TCP syncache code) there's a piece
of code using uptime information in HZ resolution for
timing purposes or whatever. However, it uses a signed
int, maybe just for intermediate results, causing an
overflow after 2^31/HZ seconds, which leads to wrong
results and finally hanging sockets in the SYN_RCVD state.
Could anyone familiar with that code help me locate the bug
in the code? I'm pretty sure that my analysis isn't far
from the truth. I'm also pretty sure that a type cast
at the right place will fix it. The problem is to find
the right place.
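To illustrate the class of bug suspected here, the following is a
hypothetical sketch only. It is not taken from the syncache code, and
all function names are invented. It assumes 32-bit tick values on a
two's complement machine. A timeout check that compares absolute tick
values goes wrong once the counter wraps, while a check that looks at
the difference of the two values keeps working.

```c
#include <stdint.h>

/* Hypothetical illustration, not the actual FreeBSD syncache code. */

/*
 * Compute "now + timeout" with wraparound. The addition is done in
 * unsigned arithmetic so that it is well defined; converting the
 * result back to int32_t wraps on two's complement machines.
 */
static int32_t
deadline_after(int32_t now, int32_t timeout)
{
	return ((int32_t)((uint32_t)now + (uint32_t)timeout));
}

/*
 * Naive check: compares absolute tick values. If the deadline was set
 * shortly before the wrap and "now" has since wrapped to a negative
 * value, this stays false for a very long time.
 */
static int
expired_naive(int32_t now, int32_t deadline)
{
	return (now >= deadline);
}

/*
 * Wrap-safe check: subtract in unsigned arithmetic, then interpret
 * the difference as signed. Correct as long as the two values are
 * less than 2^31 ticks apart.
 */
static int
expired_wrapsafe(int32_t now, int32_t deadline)
{
	return ((int32_t)((uint32_t)now - (uint32_t)deadline) >= 0);
}
```

For example, with a deadline of INT32_MAX - 5 and a current tick value
ten ticks later (which has wrapped to a large negative number),
expired_naive() reports that the timeout has not fired, while
expired_wrapsafe() reports that it has. An entry checked the naive way
would then linger, which would look much like the stuck sockets
described above.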
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd
"C is quirky, flawed, and an enormous success."
-- Dennis M. Ritchie.