The sonewconn listen queue overflow issue

Mon Feb 3 19:42:03 UTC 2014

L.S.

Since about 9.1, I think, I have been seeing a lot of messages in the
log, sometimes so many that everything else gets drowned out, of the
type:

  sonewconn: pcb 0xyyyyyyyyyyyyyyyy: Listen queue overflow: 8 already in queue awaiting acceptance

I searched a bit on the web and came across recommendations to try
netstat -nAa to find out which program this came from.

Well, running netstat -nAa | grep pcb 0xyyyyyyyyyyyyyyyy in a loop
didn't work: it gave no output, even though the messages kept coming in
the log during that time.

I forgot about it, then recently searched a bit more and found this page:

  http://lawrencechen.net/2014/sonewconn-pcb-0xfffffe006acd9310-listen-queue

So an overflow message reporting 8 queued connections is caused by a
listen value of 5 (QLEN > 3 * (QLIM / 2), i.e. QLEN > 7.5 for QLIM = 5).
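
The arithmetic can be sketched in C as below. This is only an
illustration: it assumes the kernel check is the integer expression
qlen > 3 * qlimit / 2, and the function names are made up for this
example, they are not taken from the kernel or from junkbuster.

```c
/* Assumption: the kernel logs an overflow once qlen > 3 * qlimit / 2,
 * evaluated with integer arithmetic. */

/* Smallest queue length that triggers the message for a given backlog. */
static int first_overflow_qlen(int qlimit)
{
    return 3 * qlimit / 2 + 1;
}

/* Backlog whose first overflow matches the number reported in the log,
 * or -1 if no backlog value produces that number. */
static int backlog_for_overflow(int reported)
{
    for (int qlimit = 0; qlimit <= reported; qlimit++) {
        if (first_overflow_qlen(qlimit) == reported)
            return qlimit;
    }
    return -1;
}
```

Under that assumption backlog_for_overflow(8) gives 5, and listen
values of 10 and 20 would first complain at 16 and 31 queued
connections.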

Doing:
 netstat -LaAn | grep '/5 '

resulted in
ffffzzzzzzzzzzzz tcp4  0/0/5  127.0.0.1:8000

The ffffzzz value did not match the pcb value in the message log, but
a listener on 127.0.0.1:8000 meant that in my case it had to be the
junkbuster proxy (which I still use, still works well :) ), as I didn't
run anything else locally.
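
For reference, the 0/0/5 column printed by netstat -L is
qlen/incqlen/maxqlen, so the last number is the listen value being
grepped for. A minimal sketch of pulling it out of such a field in C
(max_queue_len is a hypothetical helper name, not part of any tool):

```c
#include <stdio.h>

/* Return the third number of a "qlen/incqlen/maxqlen" field as printed
 * by netstat -L, or -1 if the field does not have that shape. */
static int max_queue_len(const char *field)
{
    int qlen, incqlen, maxqlen;

    if (sscanf(field, "%d/%d/%d", &qlen, &incqlen, &maxqlen) != 3)
        return -1;
    return maxqlen;
}
```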

So I looked into junkbuster's source and in bind.c I changed 2 instances of:

	while (listen(fd, 5) == -1) {

to

	/* 10 instead of 5, this fixes dmesg spamming of the type
	 * 'sonewconn: pcb 0xyyyyyyyyyyyyyyyy: Listen queue overflow:
	 * 8 already in queue awaiting acceptance'? */
	while (listen(fd, 10) == -1) {

This improved the situation, but the log still filled up too much. So
then I changed the value to 20, which gave me silence :)
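
Instead of hard-coding the number, the backlog could be made tunable.
The following is only a sketch of that idea, not how junkbuster does
it: parse_backlog and the JB_LISTEN_BACKLOG variable name are invented
for this example. Note also that the kernel caps the backlog at
kern.ipc.somaxconn, so very large values are silently reduced.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a backlog value from text; return fallback when the text is
 * missing, not a plain decimal number, or less than 1.
 * Hypothetical helper, not part of junkbuster. */
static int parse_backlog(const char *text, int fallback)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return fallback;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 1 || value > INT_MAX)
        return fallback;

    return (int)value;
}
```

In bind.c this could then be used as
listen(fd, parse_backlog(getenv("JB_LISTEN_BACKLOG"), 20)),
which keeps 20 as the default when the variable is unset.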

Checking the latest log I couldn't find anything in the last 10 days.

So for people who have this issue I recommend calculating which
listen value the overflow value in the message corresponds to, then
checking with netstat as above. That should make it possible to find
the daemon causing the issue. And modify that program...

But while this removes the errors, what do these messages really
signify? I mean, why didn't this happen before in earlier versions of
FreeBSD?

Some people have recommended just to stop using certain programs, but
that doesn't seem like a solution. listen(2) doesn't give a real clue.
It mentions changes to the queue in 4.5, so very long ago. I haven't
read the commit logs to see if there are hints there :)

Btw., at the moment (running 10.0 RC2) my message log gets filled up
with something else :)

  hwpstate0: set freq failed, err 6

This happened long ago already in 9.1 with my AMD3500+ and still now on
my AMD FX6100...

Regards,

Wouter