Help:: Listen queue overflow killing servers
Paul Macdonald
paul at ifdnrg.com
Fri Jul 26 16:57:25 UTC 2019
On 26/07/2019 17:11, David Christensen wrote:
> On 7/26/19 4:58 AM, Paul Macdonald via freebsd-questions wrote:
>> Over the past few months I've seen several boxes (4 or 5) become
>> unresponsive as a result of a Listen queue overflow state.
>
>> All are on ZFS and are std apache/php/mysql servers with nothing too
>> exotic.
>
>> /var/log/messages typically shows:
>>
>> kernel: sonewconn: pcb 0xfffff813395e3d58: Listen queue
>> overflow: 193 already in queue awaiting acceptance (83 occurrences)
>>
>> netstat -Lan shows
>>
>> tcp4 193/0/128 x.x.x.x.443
>> tcp4 193/0/128 x.x.x.x.80
>
>
> What Apache/ PHP/ MySQL applications? Did you write them? If not,
> who did? Is everything up to date? Have you filed bug reports?
>
>
> Do the applications have logging or debugging capabilities? Have you
> enabled them? What do they say? Where is the blockage? Deadlock?
>
>
These were on servers with multiple vhosts, often running WordPress,
but not in one instance (that box runs custom software we wrote
in-house, which has been in production for 19 years without this
issue!).

I suspect it's too low-level for application-level debugging. All I
know so far is:
- Servers become unresponsive, with Listen queue overflow messages in
  /var/log/messages.
- Unable to stop jails or even shut down; tcpdrop doesn't work
  (everything is in CLOSE_WAIT).
- On today's occasion (I can't be 100% sure, but I suspect always), all
  the Apache processes were in disk wait state, yet this was a big new
  box with a very tiny site, on NVMe.
- All servers are on FreeBSD 12 with ZFS, and Apache runs inside an
  ezjail jail.
- Load patterns vary, but 2 of the 5 or so incidents don't make much
  sense, as there would have been very little load.
- Not reproducible: I have sieged a couple of the affected boxes with no
  effect (and logs on a couple of boxes show no interesting traffic,
  just normal).
- siege -c 255 -r 2 (pretty stressful): the target server does now show
  something in the netstat queues, 0-100/512, but Apache stays out of
  disk wait, and siege is (un)successful as the target copes fine.
- Run multiple times with no problem. This has now generated about
  100,000 more lines in the Apache log than I saw after the server went
  down today (earlier, 6600 hits to a 16C/32T + 128GB + NVMe machine and
  it went down with this).
- I've just hit it with 255 concurrent users over a period of 20 mins,
  and it doesn't blink.

So it doesn't look like it's load (and that would have shown up in the
logs anyway).
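
Since the problem can't be triggered on demand, one option is to watch
the listen queues and catch a box on its way into this state. Below is
a minimal sketch, not something from the affected servers, that parses
`netstat -Lan` style lines and flags queues past the overflow point.
Assumptions: the three numbers are qlen/incqlen/maxqlen as described in
netstat(1); the 1.5 x maxqlen threshold is my reading of the sonewconn
check and should be verified against your kernel source (it does match
the 193 vs 128 figures above); the names `ListenQueue` and
`parse_listen_queues` are hypothetical, and the third sample line is
made up for illustration.

```python
import re
from typing import List, NamedTuple


class ListenQueue(NamedTuple):
    """One listening socket line from `netstat -Lan`."""
    proto: str
    qlen: int      # connections waiting to be accepted
    incqlen: int   # incomplete connections
    maxqlen: int   # configured backlog limit
    local: str     # local address and port

    @property
    def overflowing(self) -> bool:
        # Assumed threshold: the kernel appears to log an overflow once
        # qlen exceeds 1.5 times the backlog limit (integer arithmetic).
        return self.qlen > (3 * self.maxqlen) // 2


# Matches e.g. "tcp4 193/0/128 x.x.x.x.443"; header lines do not match.
_LINE = re.compile(r"^(\S+)\s+(\d+)/(\d+)/(\d+)\s+(\S+)")


def parse_listen_queues(text: str) -> List[ListenQueue]:
    """Return a ListenQueue for every line that looks like a queue entry."""
    queues = []
    for line in text.splitlines():
        m = _LINE.match(line.strip())
        if m:
            queues.append(ListenQueue(
                m.group(1), int(m.group(2)), int(m.group(3)),
                int(m.group(4)), m.group(5)))
    return queues


# First two lines are from the report above; the third is illustrative.
SAMPLE = """\
tcp4 193/0/128 x.x.x.x.443
tcp4 193/0/128 x.x.x.x.80
tcp4 100/0/512 x.x.x.x.8080
"""

if __name__ == "__main__":
    for q in parse_listen_queues(SAMPLE):
        state = "OVERFLOW" if q.overflowing else "ok"
        print(f"{q.local}: {q.qlen}/{q.maxqlen} {state}")
```

To use it on a live box, the text passed to `parse_listen_queues` would
come from running `netstat -Lan` (for example from cron) rather than
from SAMPLE; logging the process states at the same moment would show
whether Apache is already in disk wait when the queue starts to fill.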
> David
--
-------------------------
Paul Macdonald
IFDNRG Ltd
Web and video hosting
-------------------------
t: 0131 5548070
m: 07970339546
e: paul at ifdnrg.com
w: http://www.ifdnrg.com
-------------------------
IFDNRG
40 Maritime Street
Edinburgh
EH6 6SA