Trouble with snapshots

Kris Kennaway kris at FreeBSD.org
Tue Apr 1 13:19:10 PDT 2008


Cyrus Rahman wrote:
> I'm seeing serious problems with snapshot deadlocks on 7.0-RELEASE
> right now.  I haven't been able to set up a test environment to really
> determine precise details, but this much I know:  Filesystem I/O will
> eventually lock up, requiring a hard reset, after the snapshot mount
> sleeps permanently on suspfs.  Eventually there's a cascade and
> everything ends up waiting on suspfs.  Running a 'sync' after the mount
> hangs is a sure way to propagate the problem.  This happens very often,
> probably on about 15% of snapshots on the server running 7.0.
> It's bad enough that it's not realistic to use snapshots there.
>
> Other strange things have been observed as well: an entire day's worth
> of work vanished.  After the reset/reboot the filesystems were consistent,
> but in the state they were in many hours before, at the time the snapshot
> hung.  The snapshot had been observed hanging, but everything else seemed
> to work, so a decision was made to reboot at the end of the day, with
> disastrous effect!  During the day nothing unusual was noticed except for
> the hung snapshot.  I'm guessing everything just got cached (for
> hours!) and the cache never got flushed.
> 
> This is happening on a system set up with journaled UFS filesystems,
> so that may be part of the problem.  The system is running amd64 with
> an Intel Q6600.

I thought gjournal and soft updates were supposed to be mutually
exclusive (the latter is required for UFS snapshots).  Anyway, even if
they are supposed to work together, this interaction is almost certainly
the cause.
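
One rough way to check whether a filesystem has both flags turned on is to
look at the output of `tunefs -p` for the device.  The sketch below is only
an illustration: the helper names are made up, and it assumes each line of
the output looks like "tunefs: soft updates: (-n)    enabled", which may
differ between releases.  It also assumes the text has already been
captured (tunefs may print it on stderr rather than stdout).

```python
import re

# Assumed line shape (not verified against every release):
#   tunefs: soft updates: (-n)        enabled
FLAG_LINE = re.compile(
    r"^(?:tunefs:\s*)?(?P<name>.+?):\s*\(-\w\)\s+(?P<value>\S+)\s*$"
)


def parse_flags(text):
    """Map each flag name (lowercased) to its printed value."""
    flags = {}
    for line in text.splitlines():
        match = FLAG_LINE.match(line.strip())
        if match:
            flags[match.group("name").strip().lower()] = match.group("value")
    return flags


def gjournal_with_softupdates(text):
    """True if both gjournal and soft updates are reported as enabled."""
    flags = parse_flags(text)
    return (
        flags.get("gjournal") == "enabled"
        and flags.get("soft updates") == "enabled"
    )
```

If `gjournal_with_softupdates()` returns True for the affected filesystem,
that would be consistent with the combination described above.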

Kris

