The sonewconn listen queue overflow issue

Tue Feb 11 19:49:48 UTC 2014

On Monday, February 03, 2014 2:33:11 pm Photo stuff wrote:
> L.S.
> 
> Since about 9.1, I think, I have been seeing a lot of messages in the log
> that sometimes fill it up so much that everything else is drowned out, of
> the type:
> 
>   sonewconn: pcb 0xyyyyyyyyyyyyyyyy: Listen queue overflow: 8 already in queue awaiting acceptance
> 
> I searched a bit on the web and came across recommendations to try
> netstat -nAa to find out which program this came from.
> 
> Well, running netstat -nAa |grep pcb 0xyyyyyyyyyyyyyyyy in a loop didn't
> work: it gave no output even though the messages kept coming in
> the log during that time.
> 
> I forgot about it, then recently searched a bit more and found this page:
> 
>   http://lawrencechen.net/2014/sonewconn-pcb-0xfffffe006acd9310-listen-queue
> 
> So a listen queue overflow of 8 is caused by a listen value of 5 (the
> message is logged when QLEN > 3 * QLIM / 2)
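
The arithmetic behind the quoted formula can be sketched in C.  This is
only an illustration: it assumes the kernel check is qlen > 3 * qlimit / 2
evaluated in integer arithmetic, and that the number in the log message is
the queue length at the moment of the first overflow.  Both function names
are made up for this example.

```c
/* Assumption: an overflow is reported when qlen > 3 * qlimit / 2,
 * using integer division.  This returns the largest queue length
 * that is still tolerated for a given listen() backlog. */
static int overflow_threshold(int qlimit)
{
    return 3 * qlimit / 2;
}

/* Hypothetical helper: given the "N already in queue" figure from the
 * log, return the backlog that first overflows at exactly N queued
 * connections, or -1 if no backlog matches under the assumption above. */
static int backlog_for_reported_qlen(int reported)
{
    for (int qlimit = 0; qlimit <= reported; qlimit++) {
        if (overflow_threshold(qlimit) + 1 == reported)
            return qlimit;
    }
    return -1;
}
```

With the figures from this thread, backlog_for_reported_qlen(8) gives 5,
and under the same assumption a backlog of 20 would only overflow once 31
connections are queued.
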
> 
> Doing:
>  netstat -LaAn | grep '/5 '
> 
> resulted in
> ffffzzzzzzzzzzzz tcp4  0/0/5  127.0.0.1:8000
> 
> The ffffzzz value was not the value in the message log, but the
> listening address meant that in my case it had to be the junkbuster proxy
> (which I still use, still works well :) ), as I didn't run anything else
> locally.
> 
> So I looked into junkbuster's source and in bind.c I changed 2 instances of:
> 
> 	while (listen(fd, 5) == -1) {
> 
> to
> 
> 	/* 10 instead of 5, this fixes dmesg spamming of the type
> 	 * 'sonewconn: pcb 0xyyyyyyyyyyyyyyyy: Listen queue overflow: 8
> 	 * already in queue awaiting acceptance'? */
> 	while (listen(fd, 10) == -1) {
> 
> This improved the situation, but the log was still filling up too much.
> So then I changed it to 20, which gave me silence :)
> 
> Checking the latest log I couldn't find anything in the last 10 days.
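
For anyone making the same kind of change in another daemon, here is a
minimal sketch of a loopback listener that takes the backlog as a
parameter instead of hard-coding 5.  The name listen_loopback and its
arguments are invented for illustration; this is not how junkbuster's
bind.c is structured.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to 127.0.0.1:port and
 * start listening with the requested backlog.  Returns the listening
 * descriptor, or -1 on failure. */
static int listen_loopback(unsigned short port, int backlog)
{
    struct sockaddr_in sin;
    int fd, on = 1;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    /* Allow quick restarts of the daemon on the same port. */
    (void)setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
        listen(fd, backlog) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Something like listen_loopback(8000, 20) would correspond to the setup
described above.  Note that the kernel may still cap the effective backlog
(on FreeBSD via the kern.ipc.somaxconn sysctl), so a very large value is
not guaranteed to be honoured.
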
> 
> So for people who have this issue I recommend calculating which listen
> value the overflow value corresponds to, then checking as above; it
> should then be possible to find the daemon causing the issue. And modify
> that program...
> 
> But while this removes the errors, what do these messages really
> signify? I mean, why didn't this happen before in earlier versions of
> FreeBSD?

Maybe earlier versions just dropped the connections without logging a message?
The message means that connections are arriving faster than the userland
program can accept them.  There is a counter for that ("listen queue
overflows") in 'netstat -s -p tcp', and you can see if it was increasing on
older versions of FreeBSD (if you still have a machine with that around) to
make sure the only change is the additional log message.

> Btw., at the moment (running 10.0 RC2) my message log now gets filled up
> with something else :)
> 
>   hwpstate0: set freq failed, err 6
> 
> This happened long ago already in 9.1 with my AMD3500+ and still now on
> my AMD FX6100...

I am not familiar with how hwpstate works, sorry.

-- 
John Baldwin