System deadlock when using mksnap_ffs

Thu Nov 13 01:15:52 PST 2008

On Wed, Nov 12, 2008 at 10:05:21PM -0800, Jeremy Chadwick wrote:
> On Wed, Nov 12, 2008 at 09:02:50PM -0800, David Wolfskill wrote:
> > On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote:
> > > ...
> > > > > On Wed, Nov 12, 2008 at 05:58:26PM +0000, Tim Bishop wrote:
> > > > > > I've been playing around with snapshots lately but I've got a problem on
> > > > > > one of my servers running 7-STABLE amd64:
> > > > > > 
> > > > > > FreeBSD paladin 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #8: Mon Nov 10 20:49:51 GMT 2008 tdb at paladin:/usr/obj/usr/src/sys/PALADIN  amd64
> > > > > > 
> > > > > > I run the mksnap_ffs command to take the snapshot and some time later
> > > > > > the system completely freezes up:
> > > > > > 
> > > > > > paladin# cd /u2/.snap/
> > > > > > paladin# mksnap_ffs /u2 test.1
> > > > > > 
> > > > > > It only happens on this one filesystem, though, which might be to do
> > > > > > with its size. It's not over the 2TB marker, but it's pretty close. It's
> > > > > > also backed by a hardware RAID system, although a smaller filesystem on
> > > > > > the same RAID has no issues.
> > > ...
> > > Then in my book, the patch didn't fix anything.  :-)  The system is
> > > still "deadlocking"; snapshot generation **should not** wedge the system
> > > hard like this.
> > > 
> > > Also, during my own testing, I am always able to use Ctrl-T to get
> > > SIGINFO from the running process (mksnap_ffs).  That behaviour does not
> > > change for me.
> > > 
> > > The rest of the below information is good -- but I'm confused about
> > > something: is there anyone out there who can use mksnap_ffs on a
> > > filesystem (/usr is a good test source) and NOT experience this
> > > deadlocking problem?
> > 
> > I hadn't ever tried until I saw your message.  Granted, I'm using a
> > smaller file system (I doubt that I have a total of as much as 2 TB in
> > all my machines combined), and I'm running i386, vs. amd64.  But it ran
> > just fine.  I wasn't able to test SIGINFO; it finished before I had a
> > chance.  (I ran it under time(1); wall clock time was 0.91 sec.)
> > 
> > > Literally *every* FreeBSD box I have root access
> > > to suffers from this problem, so I'm a little baffled why we end-users
> > > need to keep providing debugging output when it should be easy as pie
> > > for a developer to do "dump -0 -L -a -f /path/fs.dump /usr" and watch
> > > their system wedge.
> > 
> > Well, I routinely use dump/restore pipelines to copy file systems
> > around; never had a problem with it.
> > 
> > > ...
> > 
> > For reference:
> > 
> > freebeast(7.1-P)[9] uname -a
> > FreeBSD freebeast.catwhisker.org 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #127: Wed Nov 12 05:16:20 PST 2008     root at freebeast.catwhisker.org:/common/S3/obj/usr/src/sys/FREEBEAST  i386
> > freebeast(7.1-P)[10] ls -la
> > total 4
> > drwxrwxr-x   2 root  operator  512 Nov 12 20:53 .
> > drwxr-xr-x  14 root  wheel     512 Jan 22  2008 ..
> > freebeast(7.1-P)[11] /usr/bin/time -l mksnap_ffs /S2/usr test.1
> >         0.91 real         0.00 user         0.05 sys
> >        976  maximum resident set size
> >          3  average shared memory size
> >        627  average unshared data size
> >        109  average unshared stack size
> >        104  page reclaims
> >          0  page faults
> >          0  swaps
> >          1  block input operations
> >        230  block output operations
> >          0  messages sent
> >          0  messages received
> >          0  signals received
> >        101  voluntary context switches
> >         34  involuntary context switches
> > freebeast(7.1-P)[12] ls -la
> > total 1460
> > drwxrwxr-x   2 root  operator         512 Nov 12 20:54 .
> > drwxr-xr-x  14 root  wheel            512 Jan 22  2008 ..
> > -r--r-----   1 root  operator  2410791056 Nov 12 20:54 test.1
> > freebeast(7.1-P)[13] 
> 
> David, thanks for chiming in.  This is exactly what I was
> fearing/worried about.
> 
> It would be greatly beneficial if we could figure out what triggers the
> slowdown for a lot of us, since for others (proof above) mksnap_ffs
> behaves as expected.
> 
> Since I'm able to reproduce this pretty much everywhere, here's
> information:
> 
> # df -ki /usr
> Filesystem  1024-blocks    Used     Avail Capacity iused    ifree %iused  Mounted on
> /dev/ad4s1f   163815904 3835274 146875358     3%  254864 20941934    1%   /usr
> 
> # cd /usr/.snap
> # /usr/bin/time -l mksnap_ffs /usr test.1
> 
> <after about 20 seconds, hitting Ctrl-T>
> 
> load: 1.90  cmd: mksnap_ffs 11719 [wdrain] 0.00u 0.07s 0% 1092k
>        23.25 real         0.00 user         0.00 sys
> 
>       135.98 real         0.00 user         0.62 sys
>       1092  maximum resident set size
>          4  average shared memory size
>       1081  average unshared data size
>        135  average unshared stack size
>        101  page reclaims
>          0  page faults
>          0  swaps
>        895  block input operations
>      13444  block output operations
>          0  messages sent
>          0  messages received
>          0  signals received
>       6433  voluntary context switches
>        197  involuntary context switches
> # ls -l test.1
> -r--r-----  1 root  operator  173203463240 Nov 12 21:42 test.1
> 
> David's filesystem is about 2GB, while mine is about 160GB.  His snap
> takes under 1 second, yet mine takes over 2 minutes.
> 
> Possibly the large deviation is explained by the amount of space used on
> the filesystem or the number of inodes in use?

I also want to add that snapshot removal (e.g. rm test.1) is equally
slow (the rm process also sits in wdrain); it takes about 20 seconds for
the above test.1 snapshot.  Maybe long durations during deletion are
justified, though; I don't know.
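
As a rough sanity check on the sizes involved, the sketch below converts
the two snapshot file sizes from the ls output quoted above into GiB and
compares them.  It assumes the apparent size of a snapshot file roughly
tracks the size of the filesystem it was taken on; the helper name
bytes_to_gib is made up for illustration.

```sh
#!/bin/sh
# Convert a byte count to whole GiB (integer division, rounds down).
# bytes_to_gib is a hypothetical helper name, not an existing utility.
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Snapshot file sizes copied from the two "ls" listings quoted above.
david_bytes=2410791056
jeremy_bytes=173203463240

david_gib=$(bytes_to_gib "$david_bytes")
jeremy_gib=$(bytes_to_gib "$jeremy_bytes")

echo "David:  ${david_gib} GiB"
echo "Jeremy: ${jeremy_gib} GiB"
# Size ratio between the two snapshot files, rounded down.
echo "Ratio:  $(( jeremy_bytes / david_bytes ))x"
```

Going by the time(1) figures above (0.91 vs. 135.98 seconds real), the
difference in run time looks larger than the difference in size alone,
though two data points on different hardware are not much to go on.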

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |