BETA3 crash (zfs related ?)
Javier Martín Rueda
jmrueda at diatel.upm.es
Fri Nov 30 12:27:34 PST 2007
Alexandre Biancalana wrote:
> Hi list,
> My Backup Server is running 7-BETA3 from 3 days ago, is a single
> processor Core2 Duo with 2GB Ram AMD64 SMP Kernel with ZFS.
> Last night the machine rebooted after a crash, below are my dmesg
> and some messages that I get from /var/log/messages.
> Let me know if you need some other information.
Although this is a guess, I'm relatively confident that you are having
the same problem I had. My impression is that, right now, anyone who
uses ZFS intensively with a current kernel (7-BETA) will run into it.
I'll explain:
I set up a BETA2 system, enabled ZFS, and created a pool and a few
filesystems, and everything seemed OK. However, when I started using the
filesystem heavily, it wouldn't take more than a few minutes for a panic
to show up. The message was "kmem_malloc(XXX): kmem_map too small: YYY
total allocated". I updated the sources to BETA3, but the problem was
the same. This panic means that some kernel subsystem has attempted to
use too much memory. When that happens, the kernel simply panics.
After investigating a little bit, it seems that the ARC (the ZFS cache)
was using too much memory. When you boot a FreeBSD system, the kernel
uses certain formulas to set a maximum on how much memory it will use
(sysctl vm.kmem_size, vm.kmem_size_max). For instance, on my 4 GiB
system, the kernel would set the limit at 400 MiB. The ARC also sets a
limit on how much of that memory it will use (80% I think, sysctl
vfs.zfs.arc_max). On my system, it was 320 MiB. The problem is that, for
some reason, the ARC actually uses more memory than its limit, and if it
goes beyond the global kernel limit, that's when you get the panic.
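
To see how these limits are set on a given machine, the three sysctls named above can be read and converted to MiB. This is only a minimal sketch: the helper name to_mib is made up for illustration, and names that the running system does not provide are simply skipped.

```shell
#!/bin/sh
# Convert a byte count to whole MiB (to_mib is an illustrative helper name).
to_mib() {
    echo $(( $1 / 1048576 ))
}

# Print each limit in MiB; sysctl names missing on this system are skipped.
for name in vm.kmem_size vm.kmem_size_max vfs.zfs.arc_max; do
    if bytes=$(sysctl -n "$name" 2>/dev/null); then
        echo "$name: $(to_mib "$bytes") MiB"
    fi
done
```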
According to some messages I read a few days ago, there was a recent
change in how the kernel's memory usage is computed, and this ZFS panic
has only been possible since then. I don't really know if the
culprit is the ARC because it goes over its limit, or the way the memory
is accounted for because it overestimates the ARC memory usage, or if
there is some other problem.
I solved the problem by telling the kernel to increase its global limit
well over the ARC limit. That way, even if the ARC uses more memory than
it should, it will not hit the global limit. Doing some tests, I
observed that with the ARC maximum set to 320 MiB, usage occasionally
rose to about 650 MiB (vmstat -m | fgrep solaris).
As my system has 4 GiB, and that's much more than I really need for my
processes, I decided to add a good safety margin and set the global
kernel limit to 1.2 GiB, and the ARC limit to 320 MiB. You can always
tune it more tightly if you can't spare so much memory. I haven't
done any benchmarking as I have plenty of memory anyway. Since making
that change I haven't had a single panic.
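
The reasoning behind the margin is plain arithmetic, sketched here with the figures quoted above. The function name headroom_mib is hypothetical, and 1228 MiB is used as a rounded stand-in for 1.2 GiB.

```shell
#!/bin/sh
# Headroom in MiB between the kernel memory limit ($1) and the observed
# ARC peak ($2). A negative result means the peak exceeds the limit.
headroom_mib() {
    echo $(( $1 - $2 ))
}

headroom_mib 400 650    # default 400 MiB limit vs. ~650 MiB peak: prints -250
headroom_mib 1228 650   # raised ~1.2 GiB limit vs. the same peak: prints 578
```

With the default limit the peak overshoots by about 250 MiB, which is where the panic comes from; with the raised limit there are several hundred MiB to spare.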
This is my setup in /boot/loader.conf:
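
For illustration only, entries along these lines would match the figures given above (a kernel limit of about 1.2 GiB and an ARC limit of 320 MiB), using the tunables named earlier. The exact values, the rounding of 1.2 GiB to 1228M, and the use of the M size suffix are assumptions rather than a verbatim copy of the file.

```
# Illustrative values only (assumed, not the original file contents).
# Global kernel memory limit, about 1.2 GiB:
vm.kmem_size="1228M"
vm.kmem_size_max="1228M"
# ARC limit, kept well below the kernel limit:
vfs.zfs.arc_max="320M"
```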
I warn you that if you set those limits incorrectly the kernel may panic
as soon as it starts booting (and you won't be able to invoke an editor
to correct the file, of course). So, either have a FreeSBIE CD handy in
case you have to boot from it to edit the file, or make sure you know
how to use Option 6 of the FreeBSD boot menu (Escape to loader prompt)
to change the parameters before booting.
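
As a rough sketch of that recovery path: at the loader prompt, the variables read from loader.conf can be inspected, cleared, or overridden for a single boot before the kernel starts. The value shown is only an example, and the sketch assumes that clearing a tunable makes the kernel fall back to its own computed default.

```
\ Show the value loader.conf supplied for one tunable
show vm.kmem_size
\ Either clear the suspect settings for this boot only...
unset vm.kmem_size
unset vm.kmem_size_max
unset vfs.zfs.arc_max
\ ...or override one with a different value (example value, assumed)
set vfs.zfs.arc_max="320M"
\ Continue booting with the adjusted environment
boot
```

Changes made this way last only for that boot, so /boot/loader.conf still has to be corrected once the system is up.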