8.1R possible zfs snapshot livelock?

Tue May 17 07:30:31 UTC 2011

On Tue, May 17, 2011 at 02:43:44AM -0400, Charles Sprickman wrote:
> Not sure if it's worth troubleshooting this too much before
> upgrading, but we recently had an 8.1R/amd64 box hang in a way that
> suggested everything was waiting on disk access.  It's remote and we
> had to resort to a power-cycle to bring it back (we have serial
> console, but it hung after accepting the root password).
> 
> We run hourly/daily/weekly/monthly snapshots on about a half dozen
> filesystems using RSE's snapshot script (see
> http://people.freebsd.org/~rse/snapshot/ - we only use the zfs
> snapshotting and do not use the amd portion).  We have some basic
> stats logged on all our boxes every 5 minutes and I saw a pile of
> cron jobs stuck in disk I/O wait.  I suspect these were the
> snapshots.  Shortly after that it seems as if all disk I/O got hung.
> 
> Some additional info about what the main tasks are on this box:
> 
> -qmail deliveries (lots)
> -postgres (light use)
> -nfs export of qmail log dirs to another box that does log analysis
> 
> All services are spread amongst a handful of jails.  Each jail has
> its own zfs filesystem.
> 
> Does this sound familiar to anyone running ZFS with snapshots?

Yes, and that is exactly why I don't use them.  :-)  It sounds less like
the kernel locked up waiting on disk I/O (you didn't see any disk or
controller issues on the console) and more like a bug in ZFS snapshots
or in ZFS itself (a kernel thread deadlock).  We're talking about
FreeBSD 8.1-RELEASE here; ZFS innards have changed greatly between then
and now.

Understandably (and justifiably), folks will almost certainly recommend
that you upgrade the machine to RELENG_8 (8.2-STABLE) and see if the
problem recurs.  If it does, you'll probably need to drop the machine to
DDB remotely (via serial console) and issue whatever commands a kernel
developer asks you to run.

If this is a production machine, doing that probably isn't possible (it
may take days or weeks before someone gets back to you), so the best
thing to do would be to ensure you have dumpdev="auto" (or a specific
device of your choice) set in /etc/rc.conf so that all memory can be
dumped to swap, and a /var filesystem large enough to hold it all.  Then
drop to DDB and force a crash dump by issuing "call doadump", followed
by "reboot".  On the way back up, savecore(8) will find the kernel dump
in swap and save it to files in /var/crash, which can later be examined
using kgdb.
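
As a rough sketch of the above (not a tested recipe for your box), the
rc.conf side and the console steps might look like this.  The dumpdir
line just restates the default, the vmcore.0 file name assumes this is
the first dump saved, and breaking into DDB from the serial console
assumes your kernel was built with DDB plus BREAK_TO_DEBUGGER or
ALT_BREAK_TO_DEBUGGER; check all of that against your own setup.

```shell
# /etc/rc.conf fragment: let the system pick a swap device for dumps.
dumpdev="AUTO"
# Where savecore(8) writes the dump at boot (this is the default).
dumpdir="/var/crash"

# When the hang happens, from the serial console (assumes DDB is in
# the kernel and a break-to-debugger option is enabled):
#   db> call doadump
#   db> reboot
#
# After the reboot, once savecore(8) has written the dump
# (vmcore.0 assumed to be the first one saved):
#   kgdb /boot/kernel/kernel /var/crash/vmcore.0
```

Swap has to be at least as large as the memory being dumped, otherwise
the dump won't fit.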

> Anything I should log to get more data if this happens again?  I
> have output from arc_summary.pl running every 5 minutes as part of
> our general status logging.
> 
> Any pointers to known issues in ZFS (both 8.1 and 8.2) would be helpful.

There's a whole ton of issues, but noting them all is virtually
impossible at this point.  CVS commits / cvsweb are probably a better
way to see what's been fixed.  I've been screaming for years about the
need for concise documentation every time the ZFS code (in RELENG_8 at
least) is touched with an explanation of what the problem was and what
was fixed; I've since given up that effort.

> Also, anywhere to look for the general state of ZFS besides this page?
> 
> http://wiki.freebsd.org/ZFS

The freebsd-fs and freebsd-stable mailing lists are pretty much the
source of truth these days.  Basically if you use ZFS you're sort of
expected to be subscribed to them and following them daily.

-- 
| Jeremy Chadwick                                   jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |