Help:: Listen queue overflow killing servers

Janos Dohanics web at 3dresearch.com
Fri Jul 26 13:11:57 UTC 2019


On Fri, 26 Jul 2019 12:58:45 +0100
Paul Macdonald via freebsd-questions <freebsd-questions at freebsd.org>
wrote:

> 
> Hi,
> 
> Over the past few months I've seen several boxes (4 or 5) become
> unresponsive as a result of a Listen queue overflow state.
> 
> Processes stack up and none can be killed. All of them are in jails,
> and the jail cannot be stopped nor the server rebooted (without a
> power cycle).
> 
> All are on ZFS and are std apache/php/mysql servers with nothing too
> exotic.
> 
> All are on 12.0-RELEASE. I've only started seeing these issues recently,
> but they seem to be getting more frequent.
> 
> /var/log/messages typically shows:
> 
>      kernel: sonewconn: pcb 0xfffff813395e3d58: Listen queue overflow: 193 already in queue awaiting acceptance (83 occurrences)
> 
> netstat -Lan  shows
> 
> tcp4 193/0/128                          x.x.x.x.443
> tcp4  193/0/128                          x.x.x.x.80
> 
> Connections cannot be killed with tcpdrop (except ssh, which can!)
> 
> All processes seem to be in disk wait state (many, many apache
> processes, but others are getting stuck too):
> 
> www      60089    0.0 0.1  196588   78328  -  DJ   21:07    1:19.54 /usr/local/sbin/httpd -DNOHTTPACCEPT
> ..<snip>
> 
> www      93713    0.0 0.0  183576   33164  -  DJ   23:57    0:00.01 /usr/local/sbin/httpd -DNOHTTPACCEPT
> 
> but no zombies..
> 
> last pid: 24773;  load averages:  0.00,  0.00, 0.00    up 52+11:41:09  11:48:02
> 918 processes: 1 running, 917 sleeping
> CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> Mem: 107M Active, 3729M Inact, 93G Wired, 27G Free
> ARC: 79G Total, 54G MFU, 23G MRU, 243M Anon, 710M Header, 1615M Other
>       73G Compressed, 191G Uncompressed, 2.60:1 Ratio
> Swap: 4096M Total, 4096M Free
> 
> 
> I'd appreciate any advice, as at present it looks like my only option
> is to hard power cycle these boxes.

I have also been trying to find a resolution to a similar problem
(FreeBSD 12.0-STABLE r345381, virtual instance, not a jail).

Apparently at random, TCP sockets on ports 110 and 143 get stuck in the
CLOSE_WAIT state (cyrus 3.0.10). My understanding is that in CLOSE_WAIT
the remote peer has already closed its end of the connection, and the
socket is waiting for the server application to close its own end.
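
To illustrate what I mean by CLOSE_WAIT, here is a minimal Python sketch
(my own illustration, nothing to do with how cyrus is written; the
function name demo_close_wait is made up). The peer closes first, the
server side sees an empty read, and the socket stays open on the server
side until the application itself calls close().

```python
import socket


def demo_close_wait():
    """Show that a peer's close leaves our socket open until we close it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()
    conn.settimeout(5)                   # do not hang forever if something is off

    client.close()                       # the peer sends its FIN
    data = conn.recv(1024)               # b"" means the peer has closed

    # From here until conn.close(), the kernel keeps conn in CLOSE_WAIT.
    # An application that never reaches close() leaves it there indefinitely.
    still_open = conn.fileno() != -1

    conn.close()                         # the application closes its end
    closed = conn.fileno() == -1

    listener.close()
    return data, still_open, closed


if __name__ == "__main__":
    print(demo_close_wait())
```

If a sleep is put in place of conn.close(), the connection should show
up as CLOSE_WAIT in netstat -an for as long as the process sleeps.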

When the listen queue overflows, I too am unable to restart cyrus, even
with kill -9; reboot(8) doesn't work, and new ssh connections are not
accepted. A hard reboot is the only "remedy".

I have increased the cyrus listen queue from the default 32 to 128, but
I think that's just putting a larger bucket under a leaking roof.
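
For keeping an eye on the queues before a box wedges completely,
something like the following Python sketch could be run against the
output of netstat -Lan. It assumes the Listen column has the form
qlen/incqlen/maxqlen, as in the output quoted above; the function name
and the port 22 line in the sample are made up for illustration.

```python
import re

# Matches lines such as "tcp4  193/0/128   x.x.x.x.443" from netstat -Lan.
LINE_RE = re.compile(r"^(\S+)\s+(\d+)/(\d+)/(\d+)\s+(\S+)")


def overflowing_listeners(netstat_output):
    """Return (proto, local_address, qlen, maxqlen) for full listen queues."""
    full = []
    for line in netstat_output.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue                     # header or unrelated line
        proto, qlen, _incqlen, maxqlen, addr = m.groups()
        if int(qlen) >= int(maxqlen):    # queue is at or beyond its limit
            full.append((proto, addr, int(qlen), int(maxqlen)))
    return full


# The first two lines are from the quoted report; the third is a
# hypothetical healthy listener added for contrast.
SAMPLE = """\
tcp4 193/0/128                          x.x.x.x.443
tcp4  193/0/128                          x.x.x.x.80
tcp4  0/0/128                            x.x.x.x.22
"""

if __name__ == "__main__":
    for proto, addr, qlen, maxqlen in overflowing_listeners(SAMPLE):
        print(f"{proto} {addr}: {qlen} queued, limit {maxqlen}")
```

This only reports the symptom, of course; it says nothing about why the
application stopped accepting connections.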

-- 
Janos Dohanics

