ZFS hang issue and prefetch_disable - UPDATE

CZUCZY Gergely gergely.czuczy at harmless.hu
Wed Aug 6 09:29:48 UTC 2008


Hello,

A few weeks ago, I was referring to exactly this, somewhere around here:
http://lists.freebsd.org/pipermail/freebsd-fs/2008-July/004796.html

The fact that it works on pointyhat, and that it works on kris@'s box, is just
"it works for me" level evidence, not proof of any stability or reliability.

FreeBSD is a quite stable OS, and the code is of relatively good quality as far
as I've seen. For some reason the ZFS port seems to be an exception: it hasn't
been merged properly, and its issues haven't been solved.

No matter how much you tune ZFS, no matter what you disable, there is no
guarantee, not even the slightest, that it won't freeze your box or throw a
panic, or that it will keep your data intact.

Many of us have reported this, but no one has looked into it. I know we're free
to use something else, but that's not the point. The point is, I don't see the
value of a port of this quality. I know it's quite complex and whatnot, but at
this level it cannot be run in a production environment. It lacks reliability.

No matter how much you hack on it, there's always a non-negligible chance that
it will shoot you in the back when you're not watching.

I hope the latest ZFS patches will solve a lot of these issues, and that we
won't see problems like this anymore.

On Thu, 31 Jul 2008 13:58:26 -0700
Matt Simerson <matt at corp.spry.com> wrote:

> 
> My announcement that vfs.zfs.prefetch_disable=1 resulted in a stable  
> system was premature.
> 
> One of my backup servers (see specs below) hung. When I got onto the  
> console via KVM, it looked normal with no errors but didn't respond to  
> Control-Alt-Delete.  After a power cycle, zpool status showed 8  disks  
> FAULTED and the action state was: http://www.sun.com/msg/ZFS-8000-5E
> 
> Basically, that meant my ZFS file system and 7.5TB of data were gone.  
> Ouch.
> 
> I'm using a pair of ARECA 1231ML RAID controllers. Previously, I had  
> them configured in JBOD with raidz2. This time around, I configured  
> both controllers with one 12 disk RAID 6 volume. Now FreeBSD just sees  
> two 10TB disks which I stripe with ZFS:
>   zpool create back01 /dev/da0 /dev/da1
> 
> I also did a bit more fiddling with /boot/loader.conf. Previously I had:
> 
> vm.kmem_size="1536M"
> vm.kmem_size_max="1536M"
> vfs.zfs.prefetch_disable=1
> 
> This resulted in ZFS using 1.1GB of RAM (as measured using the  
> technique described on the wiki) during normal use. The system in  
> question hung during the nightly processing (which backs up some other  
> systems via rsync) and my suspicions are that when I/O load picked up,  
> it exhausted the available kernel memory and hung the system. So now I  
> have these settings on one system:
> 
> vm.kmem_size="1536M"
> vm.kmem_size_max="1536M"
> vfs.zfs.arc_min="16M"
> vfs.zfs.arc_max="64M"
> vfs.zfs.prefetch_disable=1
> 
> and the same except vfs.zfs.arc_max="256M" on the other. The one with  
> 64M uses 256MB of RAM for ZFS and the one set at 256M uses 600MB of  
> RAM. These are measured under heavy network and disk IO load being  
> generated by multiple rsync processes pulling backups from remote  
> nodes and storing it on ZFS. I am using ZFS compression.
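As a side note, one quick way to sanity-check an ARC figure like that, assuming
this version of the port exports the arcstats kstats via sysctl (the fallback
value below is only there to illustrate the byte-to-MB conversion, it is not a
real measurement):

```shell
# Read the current ARC size (bytes) and report it in MB. If the kstat sysctl
# isn't available, fall back to a sample value so the conversion still runs.
arc_bytes=$(sysctl -n kstat.zfs.misc.arcstats.size 2>/dev/null || echo 268435456)
echo "$arc_bytes" | awk '{ printf "ARC size: %.0f MB\n", $1 / 1048576 }'
```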
> 
> I get much better performance now with RAID 6 on the controller and  
> ZFS striping than using raidz2.
> 
> Unless tuning the arc_ settings made the difference. Either way, the  
> system I just rebuilt is now quite a bit faster with RAID 6 than JBOD  
> + raidz2.
> 
> Hopefully tuning vfs.zfs.arc_max will result in stability. If it  
> doesn't, my next choice is upgrading to -HEAD with the recent ZFS  
> patch or ditching ZFS entirely and using geom_stripe. I don't like  
> either option.
> 
> Matt
> 
> 
> > From: Matt Simerson <matt at corp.spry.com>
> > Date: July 22, 2008 1:25:42 PM PDT
> > To: freebsd-fs at freebsd.org
> > Subject: ZFS hang issue and prefetch_disable
> >
> > Symptoms
> >
> > Deadlocks under heavy IO load on the ZFS file system with  
> > prefetch_disable=0.  Setting vfs.zfs.prefetch_disable=1 results in a  
> > stable system.
> >
> > Configuration
> >
> > Two machines. Identically built. Both exhibit identical behavior.
> > 8 cores (2 x E5420) x 2.5GHz, 16 GB RAM, 24 x 1TB disks.
> > FreeBSD 7.0 amd64
> > dmesg: http://matt.simerson.net/computing/zfs/dmesg.txt
> >
> > Boot disk is a read only 1GB compact flash
> > # cat /etc/fstab
> > /dev/ad0s1a  / ufs  ro,noatime  2 2
> >
> > # df -h /
> > Filesystem  1K-blocks   Used  Avail Capacity  Mounted on
> > /dev/ad0s1a    939M    555M    309M    64%    /
> >
> > RAM has been boosted as suggested in ZFS Tuning Guide
> > # cat /boot/loader.conf
> > vm.kmem_size="1610612736"
> > vm.kmem_size_max="1610612736"
> > vfs.zfs.prefetch_disable=1
> >
> > I haven't mucked much with the other memory settings as I'm using  
> > amd64 and according to the FreeBSD ZFS wiki, that isn't necessary.  
> > I've tried higher settings for kmem but that resulted in a failed  
> > boot. I have ample RAM And would love to use as much as possible for  
> > network and disk I/O buffers as that's principally all this system  
> > does.
> >
> > Disks & ZFS options
> >
> > Sun's "Best Practices" suggests limiting the number of disks in a  
> > raidz pool to no more than 6-10, IIRC. ZFS is configured as shown:
> > http://matt.simerson.net/computing/zfs/zpool.txt
> >
> > I'm using all of the ZFS default properties except: atime=off,  
> > compression=on.
> >
> > Environment
> >
> > I'm using these machines as backup servers. I wrote an application  
> > that generates a list of the thousands of VPS accounts we host. For  
> > each host, it generates a rsnapshot configuration file and backs up  
> > up their VPS to these systems via rsync. The application manages  
> > concurrency and will spawn additional rsync processes if system i/o  
> > load is below a defined threshold. Which is to say, I can crank up  
> > or down the amount of disk IO the system sees.
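For illustration, the throttling logic described there might look roughly like
this in sh. The threshold value, the function name, and the iostat parsing are
my assumptions for the sketch, not the actual application:

```shell
#!/bin/sh
# Sketch: admit another rsync worker only while measured IOPS stay below a
# threshold. MAX_IOPS and below_threshold are illustrative names.
MAX_IOPS=200

# Succeed (exit 0) when the sampled IOPS figure permits another worker.
below_threshold() {
    [ "$1" -lt "$MAX_IOPS" ]
}

# In the real loop the sample would come from iostat, e.g.
#   iops=$(iostat -d -x 1 2 | awk 'END { print int($4) }')
# and each admitted host would get its own worker in the background:
#   rsync -a "backup@$host:/vps/" "/back01/$host/" &
iops=150
if below_threshold "$iops"; then
    echo "spawn another rsync worker"
else
    echo "back off and re-sample iostat"
fi
```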
> >
> > With vfs.zfs.prefetch_disable=0, I can trigger a hang within a few  
> > hours (no more than a day). If I keep the i/o load (measured via  
> > iostat) down to a low level (< 200 iops) then I  still get hangs but  
> > less frequently (1-6 days).  The only way I have found to prevent  
> > the hangs is by setting vfs.zfs.prefetch_disable=1.
> 
> _______________________________________________
> freebsd-fs at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"


-- 
Best regards,

Czuczy Gergely
Harmless Digital Bt
mailto: gergely.czuczy at harmless.hu
Tel: +36-30-9702963