Listen queue overflow: N already in queue awaiting acceptance

Gleb Smirnoff glebius at FreeBSD.org
Thu Jul 11 14:52:31 UTC 2013


On Thu, Jul 11, 2013 at 04:49:25PM +0200, Luigi Rizzo wrote:
L> >> IMO, this should be a single counter accessible via sysctl, with no
L> >> printf(). Those, who need details on whether this is micro-burst or
L> >> persistent condition, can run monitoring software that draws plots.
L> >
L> >
L> > The single counter wouldn't tell you anything because it misses which
L> > socket/accept queue is affected by the overflow.  The inpcb pointer
L> > can be cross-referenced with netstat -a.
L> >
L> > Andriy for example would never have found out about this problem other
L> > than receiving vague user complaints about aborted connection attempts.
L> > Maybe after spending many hours searching for the cause he may have
L> > inferred from endless scrolling in Wireshark that something wasn't
L> > right and blame syncache first.  Only later it would emerge that he's
L> > either receiving too many connections or his application is too slow
L> > dealing with incoming connections.
L> >
L> > If you can recommend a suitable and general sysadmin friendly monitoring
L> > software that will point out this problem I'm all ears.
L> 
L> the problem with these non-throttled messages is that they often
L> cause thrashing -- you become slightly slow, messages start being
L> generated and your system becomes a lot slower, making it hard
L> to recover.
L> 
L> What i usually do is throttle (in the kernel) and count the number of
L> messages suppressed. Something like this (in a macro):
L> 
L> static int ctr, last_tick;
L> if (ticks - last_tick > suppression_delay) {
L>     printf("got this error ... (%d times)\n", ... , ctr);
L>     ctr = 0;
L>     last_tick = ticks;
L> } else {
L>     ctr++;
L> }
L> 
L> the errors may not be exactly the same, the counter is race-prone
L> (you can make it atomic if you really feel like) but the whole point is
L> to get the idea that something is very wrong, not the exact count
L> or pointer

btw, there is a ready-made function for that: ppsratecheck(), already used
for suppressing some error messages.
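
To illustrate the idea, here is a rough userland sketch of that kind of
check combined with the suppressed-message counter from the quoted
snippet. It is not the kernel code: struct rate_state, rate_check() and
log_overflow() are hypothetical names made up for this example, and the
time is passed in as plain seconds instead of being read from ticks. In
the kernel you would call ppsratecheck(struct timeval *lasttime,
int *curpps, int maxpps) with static state instead.

```c
#include <stdio.h>

/* Hypothetical stand-in for the state ppsratecheck() keeps per message. */
struct rate_state {
	long last_sec;	/* start of the current window; 0 means "never used" */
	int  cur;	/* events seen in the current window */
};

/*
 * Return 1 if the event may be reported now, 0 if it should be dropped.
 * At most max_per_sec events pass per one-second window; a negative
 * limit means "no limit". now_sec is supplied by the caller and is
 * assumed to be non-zero.
 */
static int
rate_check(struct rate_state *st, long now_sec, int max_per_sec)
{
	if (st->last_sec == 0 || now_sec - st->last_sec >= 1) {
		/* First call or window expired: start a new window. */
		st->last_sec = now_sec;
		st->cur = 1;
		return (max_per_sec != 0);
	}
	st->cur++;
	return (max_per_sec < 0 || st->cur <= max_per_sec);
}

/*
 * Report a queue overflow at most once per second and say how many
 * reports were swallowed since the last one that got through.
 * Returns 1 if a line was printed, 0 if it was suppressed.
 */
static int
log_overflow(struct rate_state *st, int *suppressed, long now_sec, int qlen)
{
	if (!rate_check(st, now_sec, 1)) {
		(*suppressed)++;
		return (0);
	}
	printf("listen queue overflow: %d in queue (%d messages suppressed)\n",
	    qlen, *suppressed);
	*suppressed = 0;
	return (1);
}
```

As in the quoted version, the counters are not protected against races;
that is acceptable if the goal is only to signal that something is
wrong, not to report an exact count.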

-- 
Totus tuus, Glebius.

