Help:: Listen queue overflow killing servers
Janos Dohanics
web at 3dresearch.com
Fri Jul 26 13:11:57 UTC 2019
On Fri, 26 Jul 2019 12:58:45 +0100
Paul Macdonald via freebsd-questions <freebsd-questions at freebsd.org>
wrote:
>
> Hi,
>
> Over the past few months I've seen several boxes (4 or 5) become
> unresponsive as a result of a listen queue overflow.
>
> Processes stack up, none are killable, all these are within jails and
> neither the jail can be stopped nor the server rebooted (without a
> power cycle).
>
> All are on ZFS and are std apache/php/mysql servers with nothing too
> exotic.
>
> All are on 12.0-RELEASE. I've only started seeing these issues
> recently, but they seem to be getting more frequent.
>
> /var/log/messages typically shows:
>
> kernel: sonewconn: pcb 0xfffff813395e3d58: Listen queue overflow: 193 already in queue awaiting acceptance (83 occurrences)
>
> netstat -Lan shows
>
> tcp4 193/0/128 x.x.x.x.443
> tcp4 193/0/128 x.x.x.x.80
>
> Connections cannot be killed with tcpdrop (except ssh, which can!)
>
> All processes seem to be in disk wait (state D in ps): mostly apache
> processes, but others are getting stuck too.
>
> www 60089 0.0 0.1 196588 78328 - DJ 21:07 1:19.54 /usr/local/sbin/httpd -DNOHTTPACCEPT
> ..<snip>
>
> www 93713 0.0 0.0 183576 33164 - DJ 23:57 0:00.01 /usr/local/sbin/httpd -DNOHTTPACCEPT
>
> but no zombies..
>
> last pid: 24773; load averages: 0.00, 0.00, 0.00 up 52+11:41:09 11:48:02
> 918 processes: 1 running, 917 sleeping
> CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> Mem: 107M Active, 3729M Inact, 93G Wired, 27G Free
> ARC: 79G Total, 54G MFU, 23G MRU, 243M Anon, 710M Header, 1615M Other
> 73G Compressed, 191G Uncompressed, 2.60:1 Ratio
> Swap: 4096M Total, 4096M Free
>
>
> I'd appreciate any advice as at present it looks like my only option
> is to hard power cycle these
I have also been trying to find a resolution to a similar problem
(FreeBSD 12.0-STABLE r345381, virtual instance, not a jail).
Apparently at random, TCP sockets on ports 110 and 143 get stuck in
CLOSE_WAIT state (cyrus 3.0.10). My understanding is that in CLOSE_WAIT
the peer has already closed its end of the connection, and the kernel is
waiting for the server application to close the socket.
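To illustrate what I mean by CLOSE_WAIT, here is a minimal sketch in
Python (not cyrus code; the function name demo_close_wait is made up for
this example). It assumes a loopback connection: the client closes first,
the server side sees end-of-file, and the socket stays in CLOSE_WAIT
until the server application itself calls close().

```python
import socket


def demo_close_wait():
    """Return what the server side reads after the client has closed."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5)         # do not hang forever if something goes wrong

    client.close()                    # the client sends its FIN here
    data = server_side.recv(1024)     # b"" means EOF: the peer has closed

    # Between the EOF above and the close() below, the kernel keeps the
    # server-side socket in CLOSE_WAIT. A server that never reaches its
    # close() call leaves the socket in that state indefinitely.
    server_side.close()
    listener.close()
    return data


if __name__ == "__main__":
    print(repr(demo_close_wait()))
```

If the close() call on the server side is never reached, for example
because the process is blocked elsewhere, the socket remains in
CLOSE_WAIT, which would be consistent with what I am seeing.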
When the listen queue overflows, I too am unable to restart cyrus, even
with kill -9; reboot(8) doesn't work, and new ssh connections are not
accepted. A hard reboot is the only "remedy".
I have increased the cyrus listen queue from the default 32 to 128, but
I think that's just putting a larger bucket under a leaking roof.
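To see whether a larger limit only delays the overflow, the listen queue
lines printed by netstat -Lan can be checked mechanically. The following
is a rough sketch in Python, assuming the three numbers are
qlen/incqlen/maxqlen as in the output quoted above. The names
ListenQueue, parse_line and overflowing are made up for this example,
and the port 22 line in the sample is hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class ListenQueue(NamedTuple):
    proto: str
    qlen: int      # connections waiting to be accepted
    incqlen: int   # incomplete connections
    maxqlen: int   # configured queue limit
    local: str     # local address and port


# Matches lines such as "tcp4 193/0/128 x.x.x.x.443"
_LINE = re.compile(r"^(\S+)\s+(\d+)/(\d+)/(\d+)\s+(\S+)$")


def parse_line(line: str) -> Optional[ListenQueue]:
    """Parse one listen queue line; return None for headers and other text."""
    m = _LINE.match(line.strip())
    if m is None:
        return None
    proto, qlen, incqlen, maxqlen, local = m.groups()
    return ListenQueue(proto, int(qlen), int(incqlen), int(maxqlen), local)


def overflowing(lines: List[str]) -> List[ListenQueue]:
    """Return the queues whose qlen exceeds the configured maxqlen."""
    found = []
    for line in lines:
        q = parse_line(line)
        if q is not None and q.qlen > q.maxqlen:
            found.append(q)
    return found


sample = [
    "tcp4 193/0/128 x.x.x.x.443",
    "tcp4 193/0/128 x.x.x.x.80",
    "tcp4 0/0/128 x.x.x.x.22",   # hypothetical healthy listener
]

if __name__ == "__main__":
    for q in overflowing(sample):
        print(f"{q.local}: {q.qlen} queued, limit {q.maxqlen}")
```

If qlen keeps climbing back above the new limit after a while, the
application has stopped accepting connections and a bigger queue will
not help.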
--
Janos Dohanics