Listen queue overflow: N already in queue awaiting acceptance

Andre Oppermann andre at freebsd.org
Thu Jul 11 14:28:34 UTC 2013


On 11.07.2013 15:35, Gleb Smirnoff wrote:
> On Thu, Jul 11, 2013 at 09:19:40AM +0200, Andre Oppermann wrote:
> A> On 11.07.2013 09:05, Andriy Gapon wrote:
> A> > kernel: sonewconn: pcb 0xfffffe0047db3930: Listen queue overflow: 193 already in
> A> > queue awaiting acceptance
> A> > last message repeated 113 times
> A> > last message repeated 518 times
> A> > last message repeated 2413 times
> A> > last message repeated 2041 times
> A> > last message repeated 1741 times
> A> > last message repeated 1543 times
> A> > last message repeated 1283 times
> A> > last message repeated 1178 times
> A> > last message repeated 1020 times
> A> > ...
> A> >
> A> > What do these messages mean?
> A>
> A> That your server process is lagging behind in accepting new connections
> A> and quite a number of them get aborted due to a backlogged listen queue.
> A>
> A> Making the accept queue longer doesn't help; it's user space that can't
> A> keep up with the rate of new incoming connections.
> A>
> A> You can either reduce the rate of new incoming connections, optimize your
> A> server process to accept more connections in the same time, or get a beefier
> A> machine.
> A>
> A> > Is it really that important to be printed?
> A>
> A> The log messages are at DEBUG level.  People probably want to know about
> A> their server not keeping up and throwing incoming connection attempts away.
> A>
> A> > Finally, why is it not throttled?
> A>
> A> The frequency with which it happens is important to determine whether
> A> this is only a temporary spike (micro-burst) or a persistent condition.
>
> IMO, this should be a single counter accessible via sysctl, with no
> printf(). Those who need details on whether this is a micro-burst or a
> persistent condition can run monitoring software that draws plots.

A single counter wouldn't tell you anything because it doesn't say which
socket/accept queue is affected by the overflow.  The inpcb pointer
can be cross-referenced with netstat -a.
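
As a rough illustration of why the per-socket pointer matters, the sketch
below tallies overflow events per pcb address from syslog lines.  It is
only a sketch under assumptions: the two regular expressions are modelled
on the excerpt quoted above (with the wrapped message rejoined onto one
line), "last message repeated N times" is attributed to the most recent
overflow line, and the function name tally_overflows is hypothetical.

```python
import re
from collections import Counter

# Assumed message shapes, modelled on the log excerpt quoted in this thread.
OVERFLOW_RE = re.compile(
    r"sonewconn: pcb (0x[0-9a-fA-F]+): Listen queue overflow: "
    r"(\d+) already in queue"
)
REPEAT_RE = re.compile(r"last message repeated (\d+) times?")


def tally_overflows(lines):
    """Return a Counter mapping pcb address -> number of overflow events."""
    counts = Counter()
    last_pcb = None  # pcb of the most recent overflow message, if any
    for line in lines:
        m = OVERFLOW_RE.search(line)
        if m:
            last_pcb = m.group(1)
            counts[last_pcb] += 1
            continue
        m = REPEAT_RE.search(line)
        if m and last_pcb is not None:
            # syslogd's repeat summary refers to the preceding message.
            counts[last_pcb] += int(m.group(1))
            continue
        # Any unrelated message breaks the repeat chain.
        last_pcb = None
    return counts


if __name__ == "__main__":
    sample = [
        "kernel: sonewconn: pcb 0xfffffe0047db3930: Listen queue overflow: "
        "193 already in queue awaiting acceptance",
        "last message repeated 113 times",
        "last message repeated 518 times",
    ]
    for pcb, n in tally_overflows(sample).most_common():
        print(pcb, n)
```

The address with the highest count is the one to look up in the netstat
output; a single global counter would have merged all listening sockets
into one number.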

Andriy, for example, would never have found out about this problem other
than by receiving vague user complaints about aborted connection attempts.
Maybe after spending many hours searching for the cause he might have
inferred from endless scrolling in Wireshark that something wasn't
right and blamed syncache first.  Only later would it emerge that he's
either receiving too many connections or his application is too slow
in dealing with incoming connections.

If you can recommend suitable, general, sysadmin-friendly monitoring
software that will point out this problem, I'm all ears.

-- 
Andre


