mbuf cluster shortage caused kernel panic

Mike Silbersack silby at silby.com
Wed Jul 23 14:58:10 PDT 2003


On Wed, 23 Jul 2003, Kevin A. Pieckiel wrote:

> #uname -a
> FreeBSD fileserver1.smartrafficenter.net 4.7-STABLE FreeBSD 4.7-STABLE #0: Mon Dec 16 19:41:03 EST 2002     toor at fileserver1.smartrafficenter.net:/usr/obj/usr/src/sys/FILESERVER1  i386
>
> Running 4.7 stable with sources CVSed on 16 Dec 2002.
>
> My fileserver has been running since 17 Dec 2002 and suddenly lost its
> ability to talk on the network today.  Went to the console to discover
> a flood of messages that it was out of mbuf clusters, read tuning(7)
> for more info.
>
> What can I do to help solve any problems that might exist in the kernel
> code, and what suggestions do you have to keep this from happening on
> my fileserver again?
>
> Kernel, debug kernel, CVS date, kernel config, and core file can be
> made available upon request.
>
> Thanks much,
> Kevin A. Pieckiel

Your panic seems to indicate that the mbuf cluster chain became corrupted,
which could have happened in one of a few ways.  I'll address your
question in two parts:

1.  How do I prevent the system from using all mbuf clusters?

This depends on the application you're running; next time you're in a
similar situation, you may wish to run netstat -n | more and look at the
Send-Q column to see if there are a large number of connections with large
send queues that are sucking up all the mbuf clusters.

If a large number of mbuf clusters are in use without much of anything
showing up in netstat -n, then we have some sort of mbuf cluster leak,
which is much more serious.
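
To make the Send-Q check above less tedious than paging through the output
by eye, here is a minimal sketch in Python.  It assumes the usual netstat -n
column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, state).
The function name large_sendq and the 16384-byte threshold are hypothetical
choices for illustration, not anything netstat or the kernel defines.

```python
def large_sendq(netstat_output, threshold=16384):
    """Return (send_q, local, foreign) tuples for sockets whose Send-Q
    is at least `threshold` bytes, largest first.

    Assumes rows of the form:
        Proto Recv-Q Send-Q Local-Address Foreign-Address [state]
    """
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and non-socket rows: we want tcp*/udp* lines only.
        if len(fields) < 5 or not fields[0].startswith(("tcp", "udp")):
            continue
        # Recv-Q and Send-Q must both be plain integers.
        if not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        send_q = int(fields[2])
        if send_q >= threshold:
            hits.append((send_q, fields[3], fields[4]))
    # Largest queues first, so the worst offenders are at the top.
    return sorted(hits, reverse=True)
```

Feed it the captured text of netstat -n (for example, sys.stdin.read() in a
small wrapper script).  If the list comes back nearly empty while the system
is still complaining about mbuf clusters, that points toward the leak case
rather than a pile of stalled connections.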

2.  How do I prevent the system from panicking when all mbuf clusters are
used up?

This question has a more useful answer. :)

You could cvsup to 4.8-STABLE; at least two bugs which would result in
panics during mbuf exhaustion have been fixed, and an additional potential
panic-causing situation has been patched.  One of those bugs may be the
same as the one that affected you, but it would be very time-consuming to
figure out which.

Even if you stay with your current kernel version, you may want to
enable the INVARIANTS (and INVARIANT_SUPPORT) kernel options.  These
enable additional checks in the kernel which will make tracking
down future panics easier.
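
As a sketch of what that looks like, the two lines below would go into the
kernel configuration file.  I'm assuming that file is the FILESERVER1 config
named in the uname output above; adjust to whatever your config is actually
called, then rebuild and install the kernel as usual.

```
# Extra run-time consistency checks in the kernel
options         INVARIANTS
# Support code that INVARIANTS depends on
options         INVARIANT_SUPPORT
```

Expect some performance cost from the extra checks.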

If this problem is infrequent, I think your best course of action is to
build a 4.7 kernel with INVARIANTS for now, and plan on a 4.8-STABLE
upgrade at some point in the future.

Mike "Silby" Silbersack


More information about the freebsd-hackers mailing list