Listen queue overflow: N already in queue awaiting acceptance

Luigi Rizzo rizzo at iet.unipi.it
Thu Jul 11 14:49:28 UTC 2013


On Thu, Jul 11, 2013 at 4:28 PM, Andre Oppermann <andre at freebsd.org> wrote:
> On 11.07.2013 15:35, Gleb Smirnoff wrote:
>>
>> On Thu, Jul 11, 2013 at 09:19:40AM +0200, Andre Oppermann wrote:
>> A> On 11.07.2013 09:05, Andriy Gapon wrote:
>> A> > kernel: sonewconn: pcb 0xfffffe0047db3930: Listen queue overflow:
>> A> > 193 already in queue awaiting acceptance
>> A> > last message repeated 113 times
>> A> > last message repeated 518 times
>> A> > last message repeated 2413 times
>> A> > last message repeated 2041 times
>> A> > last message repeated 1741 times
>> A> > last message repeated 1543 times
>> A> > last message repeated 1283 times
>> A> > last message repeated 1178 times
>> A> > last message repeated 1020 times
>> A> > ...
>> A> >
>> A> > What do these messages mean?
>> A>
>> A> That your server process is lagging behind in accepting new
>> A> connections and quite a number of them get aborted due to a
>> A> backlogged listen queue.
>> A>
>> A> Making the accept queue longer doesn't help; it's user-space that
>> A> can't keep up with the rate of new incoming connections.
>> A>
>> A> You can either reduce the rate of new incoming connections, optimize
>> A> your server process to accept more connections in the same time, or
>> A> get a beefier machine.
>> A>
>> A> > Is it really that important to be printed?
>> A>
>> A> The log messages are at DEBUG level.  People probably want to know
>> A> about their server not keeping up and throwing incoming connection
>> A> attempts away.
>> A>
>> A> > Finally, why is it not throttled?
>> A>
>> A> The frequency it happens with is important to determine if this is only
>> A> a temporary spike (micro-burst) or persistent condition.
>>
>> IMO, this should be a single counter accessible via sysctl, with no
>> printf(). Those who need details on whether this is a micro-burst or
>> a persistent condition can run monitoring software that draws plots.
>
>
> The single counter wouldn't tell you anything because it misses which
> socket/accept queue is affected by the overflow.  The inpcb pointer
> can be cross-referenced with netstat -a.
>
> Andriy for example would never have found out about this problem other
> than receiving vague user complaints about aborted connection attempts.
> Maybe after spending many hours searching for the cause he may have
> inferred from endless scrolling in Wireshark that something wasn't
> right and blamed syncache first.  Only later would it emerge that he's
> either receiving too many connections or his application is too slow
> dealing with incoming connections.
>
> If you can recommend a suitable and general sysadmin friendly monitoring
> software that will point out this problem I'm all ears.

the problem with these non-throttled messages is that they often
cause thrashing -- you become slightly slow, messages start being
generated, and your system becomes a lot slower, making it hard
to recover.

What I usually do is throttle (in the kernel) and count the number of
messages suppressed. Something like this (in a macro):

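static int ctr, last_tick;  /* suppressed count, tick of last report */
if (ticks - last_tick > suppression_delay) {
    /* report at most once every suppression_delay ticks */
    printf("got this error ... (%d times)\n", ... , ctr);
    ctr = 0;
    last_tick = ticks;
} else {
    ctr++;                  /* suppressed: just count it */
}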

the errors may not be exactly the same, and the counter is race-prone
(you can make it atomic if you really feel like it), but the whole point
is to get the idea that something is very wrong, not the exact count
or pointer.
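
For illustration, here is a minimal user-space sketch of the same
throttle with the counter made atomic via C11 <stdatomic.h>. The names
(struct log_throttle, throttle_init, throttle_should_log) are
hypothetical, and the tick value is passed in by the caller rather than
read from the kernel's ticks variable. It is only one possible way to
do it, not what any existing kernel code does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical throttle state, one instance per log message site. */
struct log_throttle {
    atomic_uint  suppressed;  /* events counted but not printed */
    atomic_uint  last_tick;   /* tick at which we last reported */
    unsigned int delay;       /* minimum ticks between two reports */
};

static void
throttle_init(struct log_throttle *t, unsigned int delay)
{
    assert(t != NULL);
    atomic_init(&t->suppressed, 0);
    atomic_init(&t->last_tick, 0);
    t->delay = delay;
}

/*
 * Returns true if the caller should print the message now; in that
 * case *suppressed holds how many events were swallowed since the
 * previous report. Returns false if the event was only counted.
 * Unsigned arithmetic keeps "now - last" well defined on wraparound.
 */
static bool
throttle_should_log(struct log_throttle *t, unsigned int now,
    unsigned int *suppressed)
{
    unsigned int last = atomic_load(&t->last_tick);

    /* Only the thread that wins the compare-and-swap reports. */
    if (now - last > t->delay &&
        atomic_compare_exchange_strong(&t->last_tick, &last, now)) {
        *suppressed = atomic_exchange(&t->suppressed, 0);
        return true;
    }
    atomic_fetch_add(&t->suppressed, 1);
    return false;
}
```

A caller would wrap its printf() in "if (throttle_should_log(&t, now,
&n))" and include n in the message. As in the macro above, the count is
approximate by design: an event racing with a report may be attributed
to the next interval.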

cheers
luigi

> --
> Andre



-- 
-----------------------------------------+-------------------------------
 Prof. Luigi RIZZO, rizzo at iet.unipi.it  . Dip. di Ing. dell'Informazione
 http://www.iet.unipi.it/~luigi/        . Universita` di Pisa
 TEL      +39-050-2211611               . via Diotisalvi 2
 Mobile   +39-338-6809875               . 56122 PISA (Italy)
-----------------------------------------+-------------------------------

