ZFS hang issue and prefetch_disable
Matt Simerson
matt at corp.spry.com
Tue Jul 22 21:25:45 UTC 2008
Symptoms
Deadlocks under heavy IO load on the ZFS file system with
vfs.zfs.prefetch_disable=0 (prefetch enabled). Setting
vfs.zfs.prefetch_disable=1 results in a stable system.
Configuration
Two machines. Identically built. Both exhibit identical behavior.
8 cores (2 x E5420) x 2.5GHz, 16 GB RAM, 24 x 1TB disks.
FreeBSD 7.0 amd64
dmesg: http://matt.simerson.net/computing/zfs/dmesg.txt
Boot disk is a read-only 1GB compact flash card
# cat /etc/fstab
/dev/ad0s1a / ufs ro,noatime 2 2
# df -h /
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/ad0s1a 939M 555M 309M 64% /
Kernel memory (kmem) has been boosted as suggested in the ZFS Tuning Guide
# cat /boot/loader.conf
vm.kmem_size= 1610612736
vm.kmem_size_max= 1610612736
vfs.zfs.prefetch_disable=1
I haven't mucked much with the other memory settings as I'm using
amd64 and, according to the FreeBSD ZFS wiki, that isn't necessary.
I've tried higher settings for kmem but that resulted in a failed
boot. I have ample RAM and would love to use as much as possible for
network and disk I/O buffers as that's principally all this system does.
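For scale, the kmem value above works out to 1.5 GiB on a machine with 16 GB of RAM. A minimal sketch of that arithmetic follows; the parse_tunables helper is hypothetical and only reads the three lines quoted from /boot/loader.conf, it is not part of any FreeBSD tool.

```python
# Sketch: read the tunables quoted above and express kmem in GiB.
# LOADER_CONF is copied from the /boot/loader.conf listing in this message.
LOADER_CONF = """\
vm.kmem_size= 1610612736
vm.kmem_size_max= 1610612736
vfs.zfs.prefetch_disable=1
"""


def parse_tunables(text):
    """Return {name: int} for simple name=value lines (hypothetical helper).

    Tolerates spaces around '=' and optional double quotes around the value.
    """
    tunables = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        tunables[name.strip()] = int(value.strip().strip('"'))
    return tunables


tunables = parse_tunables(LOADER_CONF)
kmem_gib = tunables["vm.kmem_size"] / 2**30  # bytes -> GiB
print(f"vm.kmem_size = {kmem_gib} GiB on a machine with 16 GB of RAM")
```

So less than a tenth of the installed memory is currently available as kernel memory.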
Disks & ZFS options
Sun's "Best Practices" suggests limiting the number of disks in a
raidz pool to no more than 6-10, IIRC. ZFS is configured as shown:
http://matt.simerson.net/computing/zfs/zpool.txt
I'm using all of the ZFS default properties except: atime=off,
compression=on.
Environment
I'm using these machines as backup servers. I wrote an application
that generates a list of the thousands of VPS accounts we host. For
each host, it generates a rsnapshot configuration file and backs up
their VPS to these systems via rsync. The application manages
concurrency and will spawn additional rsync processes if system i/o
load is below a defined threshold. Which is to say, I can crank up
or down the amount of network and disk IO the system sees.
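The throttling rule described above can be sketched roughly as follows. This is an illustration only, not the actual application: the function name, the max_running cap and the example numbers are all assumptions, and the iops figure is presumed to come from something like iostat.

```python
# Sketch of a load-based throttle: start another rsync only while measured
# disk load is below a threshold. All names here are hypothetical.
def rsyncs_to_spawn(current_iops, iops_threshold, running, max_running):
    """Return how many additional rsync processes to start on this pass (0 or 1)."""
    if running >= max_running:
        return 0  # already at the (assumed) concurrency cap
    if current_iops >= iops_threshold:
        return 0  # system is busy enough, hold off
    return 1      # below the threshold, add one more worker


# Example: with a 200 iops threshold, a quiet system gets one more rsync,
# a busy one gets none.
print(rsyncs_to_spawn(current_iops=150, iops_threshold=200, running=4, max_running=16))
print(rsyncs_to_spawn(current_iops=450, iops_threshold=200, running=4, max_running=16))
```

Raising or lowering iops_threshold is what turns the overall I/O load on the system up or down.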
With vfs.zfs.prefetch_disable=0, a hang will occur within a few hours
(no more than a day). If I keep the i/o load (measured via iostat)
down to a low level (< 200 iops) then I still get hangs but less
frequently (1-6 days). The only way I have found to prevent the hangs
is by setting vfs.zfs.prefetch_disable=1.
Matt Simerson