bin/121684: dump(8) frequently hangs
Jeremy Chadwick
koitsu at FreeBSD.org
Mon Sep 1 21:38:58 UTC 2008
On Mon, Sep 01, 2008 at 09:00:12AM -0700, Kevin Oberman wrote:
> > Date: Mon, 01 Sep 2008 09:36:11 -0400
> > From: Mike Tancsa <mike at sentex.net>
> > Sender: owner-freebsd-stable at freebsd.org
> >
> > At 05:07 AM 9/1/2008, Derek Kuliński wrote:
> >
> > >Now I'm honestly a bit scared about it (even if it will be fixed
> > >before 7.1, I'm not sure I'll hurry with the update).
> >
> > There have been a number of commits to releng_7
> > that fixed dump issues for me. A box that used
> > to regularly exhibit hung dump processes has
> > been working fine since April. e.g. a kernel from
> > 7.0-STABLE FreeBSD 7.0-STABLE #4: Wed Apr 30
> >
> > does weekly level 0 dumps and daily differential
> > dumps on the file systems below without issue
> > % df -i
> > Filesystem    1K-blocks      Used    Avail Capacity   iused    ifree %iused Mounted on
> > /dev/twed0s1a   2026030    284346  1579602      15%    2937   279685     1% /
> > devfs                 1         1        0     100%       0        0   100% /dev
> > /dev/twed0s1d   5077038    575828  4095048      12%    1197   658257     0% /tmp
> > /dev/twed0s1e  20308398  11072840  7610888      59% 1065406  1572416    40% /usr
> > /dev/twed0s1f  20308398  13275050  5408678      71%   13750  2624072     1% /var
> > /dev/twed0s1g 246875258 186393906 40731332      82% 9118036 22794922    29% /zoo
> >
> > However, you should test and make sure it works for you.
>
> I have a 7-Stable system which has not been able to successfully dump(8)
> for about 2 months. Since it contains almost no important data that is
> subject to change, it's not too big a deal, but I worry that other
> systems might start showing the same problems.
>
> I have no idea why it's failing, though, and I have spent little effort
> in troubleshooting it. I'm running 3 week old stable and I'll be
> updating to today's RELENG_7 later today.

Can someone explain what "dump frequently hangs" actually means?

Does it lock up the entire machine indefinitely (and if so, how long did
you wait for it to (hopefully) recover)?

Or does it more or less "deadlock" the machine, making it generally
unusable, until the dump is completely finished?

If the latter, I can confirm this problem -- which is why we moved all
of our production systems away from using dump on UFS2 to simply using
rsnapshot[1]. I'll try to find the thread (it was a year or so ago)
where a developer told me more or less what was going on. The problem
was that UFS2 snapshots (which is what dump takes on UFS2 systems, with
or without the -L flag) become slower and slower to generate over time,
and this is a known design issue.

If anything, this issue makes ZFS incredibly important with regard to
-STABLE, since its snapshot generation for backups does not behave this
way; it is fast and very easily manageable.

[1]: rsync is great for backups, and very fast, but there's the issue of
modifying atimes. I committed a patch to ports/net/rsync which adds an
--atimes flag, except its behaviour is not what you'd expect: the file
which was copied, at the destination, has the correct atime (of the
source), but the source itself ends up getting its atime modified, so
you're essentially destroying the atime data on the source.
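
To make the atime point concrete, here is a minimal sketch of what a
source-preserving read would look like. It is not code from rsync or from
the ports patch; the function name read_preserving_atime is made up for
illustration, and Python is used only because it is short. The idea is to
record the timestamps before reading and put them back afterwards:

```python
import os


def read_preserving_atime(path):
    """Read a file, then restore the atime/mtime it had before the read."""
    before = os.stat(path)  # stat() itself does not touch the atime
    with open(path, "rb") as fh:
        data = fh.read()  # on most mounts this read updates the atime
    # Put the original timestamps back with nanosecond precision.
    # Side effect: setting times updates the file's ctime.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return data
```

The trade-off is that restoring the times changes the ctime, and if
something else writes to the file between the stat and the utime call,
its new mtime gets overwritten. So this is a workaround with its own
costs, not a clean fix.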
This is a problem when it comes to programs which use atime to discern
things, such as classic UNIX mailboxes/mbox. "Um, why does mutt say I
don't have any new mail when I do??" In our case, the only person using
classic UNIX mboxes with a mail client local to the machine was me, so I
ended up migrating my procmail rules and data to Maildir using mutt,
solving the problem entirely.
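
For anyone wondering why a changed atime hides new mail: as far as I
understand it, classic mbox readers treat a mailbox as having new mail
when it was modified more recently than it was last read. The sketch
below shows that comparison; it is an approximation for illustration,
not mutt's actual code, and the function name mbox_has_new_mail is
hypothetical:

```python
import os


def mbox_has_new_mail(path):
    """Approximate the classic mbox check: modified after last read?"""
    st = os.stat(path)
    # An empty mailbox never counts as having new mail.
    return st.st_size > 0 and st.st_mtime > st.st_atime
```

Once a backup run reads the mbox and bumps its atime past its mtime,
this comparison comes out false even though unread messages are still
sitting in the file.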
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |