8.1R possible zfs snapshot livelock?

Jeremy Chadwick freebsd at jdc.parodius.com
Tue May 17 11:29:54 UTC 2011


On Tue, May 17, 2011 at 01:48:04PM +0300, Andriy Gapon wrote:
> on 17/05/2011 10:30 Jeremy Chadwick said the following:
> > On Tue, May 17, 2011 at 02:43:44AM -0400, Charles Sprickman wrote:
> >> Does this sound familiar to anyone running ZFS with snapshots?
> > 
> > Yes, and is exactly why I don't use them.  :-)
> 
> You put a smiley, but is this an attempt at FUD?

I wish it were.  I experienced behaviour similar to Charles' during the
early 8.x days (possibly 8.1-RELEASE, I forget; I may be thinking of
8.0?) where ZFS snapshots would occasionally result in the kernel
deadlocking on ZFS-bound I/O.  The kernel was alive/responsive to some
degree but ZFS I/O would just indefinitely stall at that point,
requiring a full system reset.  No disk or controller problems (same
hardware I'm using today actually!).

I believe fixes and improvements for snapshotting were committed
between 8.1-RELEASE and 8.2-RELEASE, but I haven't bothered to test
them.  The experience left a very bad taste in my mouth and as such I
have avoided ZFS snapshots since.

I'd be willing to try them again assuming someone can at least confirm
that commits were made to address snapshot concerns during the
past year or so.  But...

There are still some outstanding incidents that directly pertain to ZFS
snapshots, or are "related" to ZFS snapshots (meaning things like
send/recv which are commonly used alongside snapshots), which I remember
reading about but really saw no answer to:

* zfs send | ssh zfs recv results in ZFS subsystem hanging; 8.1-RELEASE;
  February 2011:
  http://lists.freebsd.org/pipermail/freebsd-fs/2011-February/010602.html

* Kernel panic during heavy disk I/O while "zfs recv" is being used
  simultaneously; CURRENT (so ZFS v28?); April 2011:
  http://lists.freebsd.org/pipermail/freebsd-fs/2011-April/011155.html

* ZFS snapshots taking an extremely long time to be deleted; RELENG_8_1;
  February 2011:
  http://lists.freebsd.org/pipermail/freebsd-fs/2011-February/010797.html

* "zfs destroy -r" not working on filesystem-level snapshots but working
  on pool-level snapshots; RELENG_8 with ZFS v28 patch (and is specific
  to ZFS v28 given the info); May 2011:
  http://lists.freebsd.org/pipermail/freebsd-fs/2011-May/011412.html

Sorry to just rattle off a bunch of URLs and issues at once; it's not my
intention to slander work on ZFS or anything even remotely like that.

I'm just wondering, given the number of problem reports that seem to
come in about snapshot or snapshot-related ZFS stuff, where we stand on
these?  This is mainly for Charles' benefit and not so much mine (our
rsnapshot/rsync-based backups work great for us at this time, sans the
stomping of atime).

-- 
| Jeremy Chadwick                                   jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



More information about the freebsd-stable mailing list