System deadlock when using mksnap_ffs

Wilko Bulte wb at freebie.xs4all.nl
Wed Nov 12 22:53:06 PST 2008


Quoting Jeremy Chadwick, who wrote on Wed, Nov 12, 2008 at 08:42:00PM -0800 ..
> On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote:
> > On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote:
> > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
> > > > I've been playing around with snapshots lately but I've got a problem on
> > > > one of my servers running 7-STABLE amd64:
> > > > 
> > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb at paladin:/usr/obj/usr/src/sys/PALADIN  amd64
> > > > 
> > > > I run the mksnap_ffs command to take the snapshot and some time later
> > > > the system completely freezes up:
> > > > 
> > > > paladin# cd /u2/.snap/
> > > > paladin# mksnap_ffs /u2 test.1
> > > > 
> > > > It only happens on this one filesystem, though, which might be to do
> > > > with its size. It's not over the 2TB marker, but it's pretty close. It's
> > > > also backed by a hardware RAID system, although a smaller filesystem on
> > > > the same RAID has no issues.
> > > > 
> > > > Filesystem  1K-blocks       Used     Avail Capacity  Mounted on
> > > > /dev/da0s1a 2078881084 921821396 990749202    48%    /u2
> > > > 
> > > > To clarify "completely freezes up": unresponsive to all services over
> > > > the network, except ping. On the console I can switch between the ttys,
> > > > but none of them respond. The only way out is to hit the reset button.
> > > 
> > > You need to provide information described in the
> > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
> > > and especially
> > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
> > 
> > Ok, I've done that, and removed the patch that seemed to fix things.
> > 
> > The first thing I notice after doing this on the console is that I can
> > still ctrl+t the process:
> > 
> > load: 0.14  cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k
> > 
> > But the top and ps I left running on other ttys have all stopped
> > responding.
> 
> Then in my book, the patch didn't fix anything.  :-)  The system is
> still "deadlocking"; snapshot generation **should not** wedge the system
> hard like this.
> 
> Also, during my own testing, I am always able to use Ctrl-T to get
> SIGINFO from the running process (mksnap_ffs).  That behaviour does not
> change for me.
> 
> The rest of the below information is good -- but I'm confused about
> something: is there anyone out there who can use mksnap_ffs on a
> filesystem (/usr is a good test source) and NOT experience this
> deadlocking problem?  Literally *every* FreeBSD box I have root access
> to suffers from this problem, so I'm a little baffled why we end-users
> need to keep providing debugging output when it should be easy as pie
> for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch
> their system wedge.

dump -L on my RELENG_7 machine does not wedge it.  So there must be
multiple factors that determine whether snapshot creation causes
problems or not.

Wilko

