FreeBSD 6.2-RC2 Available - networking zoneli freeze problem still exists.

Thomas Herrlin junics-fbsdstable at atlantis.maniacs.se
Wed Jan 10 19:21:15 UTC 2007


Bruce A. Mah wrote:
> If memory serves me right, LI Xin wrote:
>> Ken Smith wrote:
>>> On Thu, 2006-12-28 at 16:01 +0100, Thomas Herrlin wrote:
>>>> It still drives networking daemons into a frozen zoneli state under
>>>> heavy/(D)DoS network loads. Such processes can't be killed with kill -9,
>>>> so there is no way to recover from it (think frozen sshd on a very
>>>> remote/headless server).
>>>> See the stress test panic called 'Ran out of "128 Bucket"
>>>> <http://people.FreeBSD.org/%7Epho/stress/log/cons210.html>' on the 6.2
>>>> todo list and my own latest test here:
>>>> http://www.maniacs.se/~junics/temp/vmstat-z.txt
>>>> This test was on a new 6.2-RC2 install with no zone limit tweaks nor any
>>>> sbsize limits in /etc/login.conf.
>>>> I just made a VM disk image with replication instructions; however, Peter
>>>> Holm has replicated it with his own tools, so I have not bothered with
>>>> it until now.
>>> That problem is being worked on but won't be fixed for 6.2-REL.
>>> Depending on how complex the fix winds up being it may be an Errata
>>> candidate when the time comes.
>> Perhaps we should mention some known workarounds in the errata
>> documentation.  E.g. raising nmbclusters limit, etc.?
> 
> That's a good idea.  Do you have more specifics (e.g. any particular
> nmbclusters value, other workarounds, etc.)?
> 
> Thanks,
> 
> Bruce.
> 

The most reliable way of avoiding zoneli, according to my tests, is to
set an sbsize limit in /etc/login.conf to a value lower than the
mbuf_cluster zone size limit. Note that there are 2048 bytes per
cluster (see vmstat -z for details).
Alternatively, set the login.conf sbsize to a fraction of available RAM and
combine this with the 0/unlimited setting that some recommend.
Combining these two workarounds would probably be best, as letting mbufs
use unlimited RAM for networking would cause a panic or freeze sooner
or later anyway. I have not tested the combination yet, as my system has
been running stably for some time now with my current workarounds.
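
The arithmetic behind the first workaround can be sketched as follows. This
is only an illustration: the function names, the 25600-cluster figure and the
0.5 safety fraction are assumptions made for the example, not values taken
from the tests described in this mail. Only the 2048 bytes per cluster comes
from the text above.

```python
# Sketch: derive an sbsize cap that stays below the mbuf cluster pool.
# MCLBYTES matches the 2048 bytes per cluster mentioned above; everything
# else (names, fraction, example cluster count) is a hypothetical choice.

MCLBYTES = 2048  # bytes per mbuf cluster


def cluster_pool_bytes(nmbclusters):
    """Total bytes the mbuf cluster zone can hold at its limit."""
    return nmbclusters * MCLBYTES


def sbsize_cap(nmbclusters, fraction=0.5):
    """Return an sbsize value (bytes) that is a fraction of the cluster pool.

    fraction must be strictly between 0 and 1 so the cap is always lower
    than the zone limit.
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1 (exclusive)")
    return int(cluster_pool_bytes(nmbclusters) * fraction)


# Example with an assumed limit of 25600 clusters:
pool = cluster_pool_bytes(25600)   # 52428800 bytes (50 MB)
cap = sbsize_cap(25600)            # 26214400 bytes (25 MB)
```

The cluster limit actually in effect should be read from vmstat -z on the
machine in question; the resulting cap is the kind of value that would go
into the sbsize capability in /etc/login.conf.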

Problems with the sbsize limit:
With sbsize set in login.conf, some processes will be unable to allocate
socket buffers in some extreme cases. However, this does not affect overall
system stability, which is my first priority.

I have also thrown together a small executable that attempts a local
connection to the machine's sshd, including the preliminary ssh handshake, and
that can be used with the watchdogd -e parameter to reboot the box. This is
mainly for headless/remote servers that MUST NOT have their sshd frozen.
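
A check of this kind could look roughly like the sketch below. It is not the
executable mentioned above, just a minimal Python illustration of the same
idea: connect to the local sshd, wait for the SSH identification string, and
report success or failure. The function names, the 5 second timeout and the
assumption that the server sends "SSH-" as its first bytes (as OpenSSH does)
are choices made for this example.

```python
# Sketch of an sshd liveness probe; names and timeout are hypothetical.
import socket


def sshd_responds(host="127.0.0.1", port=22, timeout=5.0):
    """Return True if the server sends an SSH identification string in time.

    A frozen sshd either never accepts the connection or never sends its
    banner, so both cases end up as a timeout and return False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            banner = sock.recv(255)
    except OSError:  # covers refused connections and timeouts
        return False
    # Assumption: the identification string is the first thing sent.
    return banner.startswith(b"SSH-")


def main():
    """Exit status for a watchdog command: 0 if sshd answered, 1 if not."""
    return 0 if sshd_responds() else 1
```

A script built around this would end with sys.exit(main()) and be passed to
watchdogd -e, on the understanding (per the watchdogd manual page) that the
watchdog is only reset when the command exits with status zero.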

You can also read my mail to the fbsd-current list with the subject "Re:
zonelimit livelock, some possable workarounds"

/Thomas Herrlin


More information about the freebsd-stable mailing list