bin/121684: dump(8) frequently hangs
oberman at es.net
Mon Sep 1 23:51:47 UTC 2008
> Date: Mon, 1 Sep 2008 14:38:56 -0700
> From: Jeremy Chadwick <koitsu at FreeBSD.org>
> On Mon, Sep 01, 2008 at 09:00:12AM -0700, Kevin Oberman wrote:
> > > Date: Mon, 01 Sep 2008 09:36:11 -0400
> > > From: Mike Tancsa <mike at sentex.net>
> > > Sender: owner-freebsd-stable at freebsd.org
> > >
> > > > At 05:07 AM 9/1/2008, Derek Kuliński wrote:
> > >
> > > >Now I'm honestly a bit scared about it (even if it will be fixed
> > > >before 7.1, I'm not sure I'll hurry with the update).
> > >
> > > There have been a number of commits to releng_7
> > > that fixed dump issues for me. A box that used
> > > > to regularly exhibit hung dump processes has
> > > > been working fine since April. e.g. a kernel from
> > > 7.0-STABLE FreeBSD 7.0-STABLE #4: Wed Apr 30
> > >
> > > does weekly level 0 dumps and daily differential
> > > dumps on the file systems below without issue
> > > % df -i
> > > > Filesystem     1K-blocks      Used     Avail Capacity   iused    ifree %iused  Mounted on
> > > > /dev/twed0s1a    2026030    284346   1579602    15%      2937   279685    1%   /
> > > > devfs                  1         1         0   100%         0        0  100%   /dev
> > > > /dev/twed0s1d    5077038    575828   4095048    12%      1197   658257    0%   /tmp
> > > > /dev/twed0s1e   20308398  11072840   7610888    59%   1065406  1572416   40%   /usr
> > > > /dev/twed0s1f   20308398  13275050   5408678    71%     13750  2624072    1%   /var
> > > > /dev/twed0s1g  246875258 186393906  40731332    82%   9118036 22794922   29%   /zoo
> > >
> > > However, you should test and make sure it works for you.
> > I have a 7-Stable system which has not been able to successfully dump(8)
> > for about 2 months. Since it contains almost no important data that is
> > subject to change, it's not too big a deal, but I worry that other
> > systems might start showing the same problems.
> > I have no idea why it's failing, though, and I have spent little effort
> > in troubleshooting it. I'm running 3 week old stable and I'll be
> > updating to today's RELENG_7 later today.
> Can someone explain what "dump frequently hangs" actually means?
> Does it lock up the entire machine indefinitely (and if so, how long did
> you wait for it to (hopefully) recover)?
> Or does it more or less "deadlock" the machine, making it generally
> unusable, until the dump is completely finished?
> If the latter, I can confirm this problem -- which is why we moved all
> of our production systems away from using dump on UFS2 to simply using
> rsnapshot. I'll try to find the thread (it was a year or so ago)
> where a developer told me more or less what was going on. The problem
> was that UFS2 snapshot generation, over time, becomes slower and slower
> to generate (this is what dump does on UFS2 systems, with or without the
> -L flag), and is a known design issue.
> If anything, this issue makes ZFS incredibly important with regard to
> -STABLE, where its snapshot generation for backups does not behave this
> way; it is fast and very easily manageable.
> rsync is great for backups, and very fast, but there's the issue of
> modifying atimes. I committed a patch to ports/net/rsync which adds an
> --atimes flag, except its behaviour is not what you'd expect: the file
> which was copied, at the destination, has the correct atime (of the
> source), but the source itself ends up getting its atime modified, so
> you're essentially destroying the atime data on the source.
> This is a problem when it comes to programs which use atime to discern
> things, such as classic UNIX mailboxes/mbox. "Um, why does mutt say I
> don't have any new mail when I do??" In our case, the only person using
> classic UNIX mboxes with a mail client local to the machine was me, so I
> ended up migrating my procmail rules and data to Maildir using mutt,
> solving the problem entirely.
> | Jeremy Chadwick jdc at parodius.com |
> | Parodius Networking http://www.parodius.com/ |
> | UNIX Systems Administrator Mountain View, CA, USA |
> | Making life hard for others since 1977. PGP: 4BD6C0CB |
In my case the dump deadlocks, but the system is unaffected. The dump
just freezes. I need to look at it more closely, but I simply have not
had time. I don't even recall what state it is in when frozen, but it
can be 'kill -9'ed. The problem has persisted through at least one system
I'll try to track down more tomorrow.
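Since the state of the frozen dump is the missing piece of information here, a rough sketch of one way to capture it before reaching for 'kill -9' follows. The helper name dump_states is made up for illustration, and the sketch assumes that ps(1) accepts the pid, state, wchan and comm keywords and that the dump processes show up with the command name "dump"; neither assumption has been checked against the affected system.

```shell
#!/bin/sh
# dump_states: hypothetical helper, not part of dump(8) or ps(1).
# Reads the output of `ps -axo pid,state,wchan,comm` on stdin and prints
# "PID STATE WCHAN" for every process whose command name is exactly "dump".
# Reading from stdin means it works on saved output as well as a live system.
dump_states() {
    # NR > 1 skips the ps header line; $4 is the comm column.
    awk 'NR > 1 && $4 == "dump" { print $1, $2, $3 }'
}
```

It would be run as `ps -axo pid,state,wchan,comm | dump_states` while the dump is stuck. The wait channel column is usually the most telling part of the output for a process that is frozen but still killable.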
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman at es.net Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751