System deadlock when using mksnap_ffs

Kostik Belousov kostikbel at gmail.com
Thu Nov 13 05:21:15 PST 2008


On Thu, Nov 13, 2008 at 02:45:14AM -0800, Jeremy Chadwick wrote:
> On Thu, Nov 13, 2008 at 12:26:42PM +0200, Kostik Belousov wrote:
> > On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote:
> > > On Thu, Nov 13, 2008 at 12:41:02AM +0000, Tim Bishop wrote:
> > > > On Wed, Nov 12, 2008 at 09:47:35PM +0200, Kostik Belousov wrote:
> > > > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
> > > > > > I've been playing around with snapshots lately but I've got a problem on
> > > > > > one of my servers running 7-STABLE amd64:
> > > > > > 
> > > > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb at paladin:/usr/obj/usr/src/sys/PALADIN  amd64
> > > > > > 
> > > > > > I run the mksnap_ffs command to take the snapshot and some time later
> > > > > > the system completely freezes up:
> > > > > > 
> > > > > > paladin# cd /u2/.snap/
> > > > > > paladin# mksnap_ffs /u2 test.1
> > > > > > 
> > > > > > It only happens on this one filesystem, though, which might be to do
> > > > > > with its size. It's not over the 2TB marker, but it's pretty close. It's
> > > > > > also backed by a hardware RAID system, although a smaller filesystem on
> > > > > > the same RAID has no issues.
> > > > > > 
> > > > > > Filesystem  1K-blocks       Used     Avail Capacity  Mounted on
> > > > > > /dev/da0s1a 2078881084 921821396 990749202    48%    /u2
> > > > > > 
> > > > > > To clarify "completely freezes up": unresponsive to all services over
> > > > > > the network, except ping. On the console I can switch between the ttys,
> > > > > > but none of them respond. The only way out is to hit the reset button.
> > > > > 
> > > > > You need to provide information described in the
> > > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
> > > > > and especially
> > > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
> > > > 
> > > > Ok, I've done that, and removed the patch that seemed to fix things.
> > > > 
> > > > The first thing I notice after doing this on the console is that I can
> > > > still ctrl+t the process:
> > > > 
> > > > load: 0.14  cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k
> > > > 
> > > > But the top and ps I left running on other ttys have all stopped
> > > > responding.
> > > 
> > > Then in my book, the patch didn't fix anything.  :-)  The system is
> > > still "deadlocking"; snapshot generation **should not** wedge the system
> > > hard like this.
> > You systematically mix two completely different issues:
> > - first one is the _deadlock_ experienced by Tim;
> 
> Re-read what he wrote.  Quote:
> 
> "Ok, I've done that, and removed the patch that seemed to fix things.
> 
> The first thing I notice after doing this on the console is that I can
> still ctrl+t the process:
> 
> load: 0.14  cmd: mksnap_ffs 2603 [newbuf] 0.00u 10.75s 0% 1160k
> 
> But the top and ps I left running on other ttys have all stopped
> responding."
> 
> If he can press Control-T, it means SIGINFO can be sent to the
> mksnap_ffs process, and the process responds with that information.  So,
> the system is not deadlocked -- meaning, I believe what he experiences
> is what others experience (the system becomes completely unusable during
> mksnap_ffs running, but DOES NOT hang or lock up, it just becomes so
> god-awful slow that processes on the machine literally sit and spin for
> minutes at a time).

Unless NOKERNINFO is set in the local flags of the controlling
terminal's termios, the kernel itself prints a one-line summary like the
one shown above. This is done from the tty discipline input handler (or
whatever it is in the new tty code). No process cooperation is required.
On the other hand, actually delivering SIGINFO and getting output from a
process-installed handler does require the process to be either
executing in usermode or sleeping interruptibly. So a working ^T by
itself does not show that the process is able to run.
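
A minimal sketch of the two mechanisms described above, for illustration
only and not taken from this thread. It assumes a BSD-style <termios.h>
that defines NOKERNINFO and a <signal.h> that defines SIGINFO; where
either macro is missing the helpers simply report -1. The helper names
(kerninfo_enabled, install_info_handler, info_requested) are hypothetical.

```c
#include <signal.h>
#include <string.h>
#include <termios.h>

/*
 * Report whether the kernel-generated ^T status line is enabled on the
 * terminal open on fd: 1 = enabled, 0 = suppressed by NOKERNINFO,
 * -1 = fd is not a usable terminal or the flag does not exist here.
 */
int
kerninfo_enabled(int fd)
{
#ifdef NOKERNINFO
	struct termios t;

	if (tcgetattr(fd, &t) == -1)
		return (-1);
	return ((t.c_lflag & NOKERNINFO) ? 0 : 1);
#else
	(void)fd;
	return (-1);
#endif
}

#ifdef SIGINFO
/* Set from the handler; only async-signal-safe work is done there. */
static volatile sig_atomic_t info_seen;

static void
on_siginfo(int sig)
{
	(void)sig;
	info_seen = 1;
}
#endif

/*
 * Install a process-level SIGINFO handler. Unlike the kernel status
 * line, this handler only runs once the signal is actually delivered
 * to the process. Returns 0 on success, -1 on failure or if SIGINFO
 * is not defined on this platform.
 */
int
install_info_handler(void)
{
#ifdef SIGINFO
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_siginfo;
	sigemptyset(&sa.sa_mask);
	return (sigaction(SIGINFO, &sa, NULL));
#else
	return (-1);
#endif
}

/* Return 1 if the handler has run since startup, otherwise 0. */
int
info_requested(void)
{
#ifdef SIGINFO
	return (info_seen ? 1 : 0);
#else
	return (0);
#endif
}
```

A program would call kerninfo_enabled() on its controlling terminal and
install_info_handler() at startup, then poll info_requested() from its
main loop. The "load: ... cmd: ..." line comes from the kernel either
way; only the extra output driven by info_requested() depends on the
process getting to run its handler.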