ZFSKnownProblems - needs revision?

Mike Andrews mandrews at bit0.com
Thu Apr 9 05:52:38 UTC 2009

Ivan Voras wrote:
> Ivan Voras wrote:
>> * Are the issues on the list still there?
>> * Are there any new issues?
>> * Is somebody running ZFS in production (non-trivial loads) with
>> success? What architecture / RAM / load / applications used?
>> * How is your memory load? (does it leave enough memory for other services)
> also: what configuration (RAIDZ, mirror, etc.?)

With 7.0 and 7.1 I had frequent livelocks with ZFS when it was unable to 
get more memory from the kernel.  Tuning kmem up to 768M helped but 
didn't 100% eliminate the issues... I still had to reboot systems every 
few weeks.  Since most were either in a redundant cluster or personal 
machines, I kinda lived with it.

With the increase in default kmem size in 7.2, I removed the kmem_size 
tuning and have not had a single ZFS-related hang since.  The only 
tuning I use is vfs.zfs.arc_max="100M" and disabling prefetch (but 
leaving zil on) -- and I'm seriously considering throwing those two out 
(or at least raising arc_max) and seeing what happens.  I'm much happier 
with how 7.2's ZFS behaves than 7.1's.  It's definitely "getting 
there"... with one catch (see below).
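
For reference, the tuning described above amounts to just two lines in 
/boot/loader.conf (these are my values; the comments are mine, and as I 
said, I may drop or raise them):

```
# /boot/loader.conf -- the only ZFS tuning I still carry on 7.2
vfs.zfs.arc_max="100M"          # cap the ARC at 100 MB
vfs.zfs.prefetch_disable="1"    # disable prefetch; the ZIL stays on
```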

Anyone running an up-to-date STABLE -- on amd64, anyway -- should 
consider removing kmem_size tweaks from their loader.conf...  You don't 
need 8.x for that, just 7.2-prerelease/beta.  I don't know about i386; 
I'd be a bit nervous about ZFS on i386 given how much memory it wants.

There is a significant issue with MySQL InnoDB logs getting corrupted if 
the system ever crashes, loses power, etc.  It's very reproducible on 
demand on multiple 7.2/7.1/7.0 machines, but not reproducible in HEAD 
with its newer ZFS v13 (which is why I never opened a PR on it).  For 
now any MySQL masters I run must stay on UFS2 because of that 
showstopper...  if anyone wants to try to look at it, I can open a PR 
or send more details, just ask.

I have seen one file get corrupted in a zpool, in two separate instances 
(different machines each time), but was never able to reproduce it 
again.  Next time it happens I'll dig into it a bit more.
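
If it happens again, a scrub is probably the quickest way to see what's 
affected; `zpool status -v` will list any files with permanent errors 
(the pool name "tank" below is just a placeholder, not from my setup):

```shell
# Walk the whole pool and verify every block against its checksum
zpool scrub tank

# After the scrub finishes, -v lists any files with unrecoverable errors
zpool status -v tank
```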

This is on about seven Core 2 Quad boxes with 8 GB of memory (some have 
only 4), all amd64.  Most disk I/O is writing http logs, which can get 
pretty busy on our webservers (usually hundreds of Apache processes plus 
nginx, previously lighttpd, serving a few thousand concurrent hits)... 
plus some light maildir and some not-so-light rsync at night.  Most are 
simple mirrored pairs of SATA disks.  A few are hardware raid10 (LSI 
SAS, 3ware SAS) where ZFS is given just a single device... even though I 
know that's not optimal (two hardware raid1's or JBOD would be more 
reliable); those are personal boxes and not production, though.  I'm 
not brave enough to attempt booting off of ZFS yet; I use a small 
gmirrored UFS2 for /.  I'm not swapping to ZFS either.
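
For anyone setting up something similar, the mirrored-pair layout is a 
one-liner; the device names and pool name here are placeholders for 
whatever your hardware presents:

```shell
# Create a simple mirrored pool from two SATA disks (ad4/ad6 are examples)
zpool create tank mirror ad4 ad6

# Sanity-check the layout and pool health
zpool status tank
```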
