Listen queue overflow: N already in queue awaiting acceptance

Fri Jul 12 08:25:31 UTC 2013

On Thu, Jul 11, 2013 at 05:43:09PM +0200, Andre Oppermann wrote:
A> >> Andriy for example would never have found out about this problem other
A> >> than receiving vague user complaints about aborted connection attempts.
A> >> Maybe after spending many hours searching for the cause he may have
A> >> inferred from endless scrolling in Wireshark that something wasn't
A> >> right and blamed syncache first.  Only later would it emerge that he's
A> >> either receiving too many connections or his application is too slow
A> >> dealing with incoming connections.
A> >
A> > That's true, but OTOH there are many interesting network conditions like
A> > excessive packet loss that we don't shout about.  The stats are quietly gathered
A> > and can be examined with netstat.  If a system is properly monitored then such
A> > counters are graphed and can trigger alarms.  If the system just misbehaves then
A> > an administrator can use netstat for inspection.
A> > Spamming logs in the case of e.g. DDoS attack is not very helpful, IMO.
A> 
A> I agree with that.
A> 
A> I try to make the system behavior more transparent so that even "hidden" problems
A> can be detected easily.  This includes adding more of them, like excessive packet
A> loss.  This makes FreeBSD a more friendly platform for sysadmins whereas previously
A> people may have quietly moved on to some other OS due to such unspecific complications.
A> 
A> Most of the TCP-related debugging is protected by net.inet.tcp.log_debug.  In this
A> case it's more complicated because the socket code where this happens is protocol
A> agnostic and I can't tie it to TCP.
A> 
A> I'm currently looking into a) applying a rate limiter to the message (as suggested
A> by Luigi); and b) adding a per-socket accept queue overflow statistic that is visible
A> via netstat.  I'll post patches for testing when done.

What about the following generic idea: syslogd periodically queries the kernel
about various error counters and remembers the values. Those that have increased
since the previous query are logged.

This can be implemented in different ways: either syslogd knows all the sysctls,
or the kernel "pushes" a list of values to syslogd. These are details to be discussed.
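
To make the polling variant concrete, here is a minimal sketch in Python
(chosen only for brevity; syslogd itself is C). The counter names and the
read_counters() and log() callbacks are hypothetical stand-ins, not existing
sysctls or syslogd interfaces. A real implementation would presumably read the
values via sysctl and would also have to decide how to treat counter resets.

```python
def increased_counters(previous, current):
    """Return {name: (old, new)} for counters that grew since the last poll."""
    changed = {}
    for name, new_value in current.items():
        old_value = previous.get(name)
        # A counter seen for the first time has no baseline yet, so it is
        # only remembered, not reported.
        if old_value is not None and new_value > old_value:
            changed[name] = (old_value, new_value)
    return changed


def poll_once(read_counters, previous, log):
    """Take one snapshot, log every counter that increased, return the snapshot."""
    current = read_counters()
    changes = increased_counters(previous, current)
    for name in sorted(changes):
        old, new = changes[name]
        log("%s increased by %d (now %d)" % (name, new - old, new))
    return current


if __name__ == "__main__":
    # Hypothetical counter names and values, for illustration only.
    snapshots = iter([
        {"listen_queue_overflows": 0, "packets_dropped": 10},
        {"listen_queue_overflows": 4, "packets_dropped": 10},
    ])
    remembered = {}
    for _ in range(2):
        remembered = poll_once(lambda: next(snapshots), remembered, print)
```

The caller keeps the returned snapshot and passes it back in on the next
period, so only the deltas end up in the log.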

What do you think about the plan itself?

-- 
Totus tuus, Glebius.