dump -L on large filesystems + shutdown
Jeremy Chadwick
koitsu at FreeBSD.org
Mon Sep 10 16:40:44 PDT 2007
This weekend I had a very interesting experience with gstripe(8) on
RELENG_6 on amd64. Details of my setup: machine has 4 disks, connected
to a standard SATA300 controller (nForce 4 chipset):
ad4: 476940MB <WDC WD5000AAKS-00TMA0 12.01C01> at ata2-master SATA300
ad6: 476940MB <WDC WD5000AAKS-00TMA0 12.01C01> at ata3-master SATA300
ad8: 190782MB <WDC WD2000JD-00HBB0 08.02D08> at ata4-master SATA150
ad10: 476940MB <Seagate ST3500630AS 3.AAE> at ata5-master SATA300
Filesystem        1K-blocks      Used     Avail Capacity  Mounted on
/dev/ad8s1a          507630     66956    400064    14%    /
/dev/ad8s1d        16244334     87212  14857576     1%    /var
/dev/ad8s1e         4058062      1778   3731640     0%    /tmp
/dev/ad8s1f        32494668   2335866  27559230     8%    /usr
/dev/ad8s1g       127763620      6422 117536110     0%    /home
/dev/stripe/st0a  946030390  71642044 798705916     8%    /storage
/dev/ad10s1d      473009638  70446308 364722560    16%    /backups
ad4 = drive #1 in gstripe set (makes /dev/stripe/st0)
ad6 = drive #2 in gstripe set (makes /dev/stripe/st0)
ad8 = boot/OS drive
ad10 = drive used for periodic backups (dump(8) dumps to this disk)
All filesystems, except /, have softupdates enabled. I did not pick
custom block sizes when newfs'ing /storage and /backups.
I have a set of automated backups which run at 02:45 every day. Full
level 0 backups run on Sunday, and incremental levels 1-6 run Monday
through Saturday.
Backups are done using the following command set:
/sbin/dump -{level} -a -h0 -u -C16 -L -f- /backups/foo.{level}.dump
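For anyone who wants to reproduce the schedule, here is a rough sketch of how the {level} value could be picked each night. It assumes the level simply follows the weekday number printed by `date +%w` (0 = Sunday through 6 = Saturday), which matches the Sunday = level 0, Mon-Sat = levels 1-6 rotation described above. The function name `dump_level` is made up for illustration and is not part of my actual backup scripts.

```shell
#!/bin/sh
# Sketch only: map a weekday number to a dump(8) level.
# Assumption: 0 = Sunday ... 6 = Saturday, as printed by `date +%w`,
# so Sunday gets the full level 0 dump and Mon-Sat get levels 1-6.
# dump_level is a hypothetical helper name, not an existing tool.
dump_level() {
    case "$1" in
        [0-6])
            echo "$1"
            ;;
        *)
            echo "invalid weekday number: $1" >&2
            return 1
            ;;
    esac
}

# Work out tonight's level; it would replace {level} in the dump command.
level=$(dump_level "$(date +%w)")
echo "tonight's dump level: ${level}"
```

Since the weekday number and the dump level coincide, the function is really just a range check, but keeping it separate makes it easy to change the rotation later.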
The incident I'm about to describe happened on Sunday. I was dealing
with an unrelated issue (some Ethernet problems), and I had to reboot
the FreeBSD box in the process. I rebooted it using reboot(8). This
was around 03:05 -- in the middle of the backups.
The first thing I noticed was that the ATA "flush-to-disk" stuff was
taking a long time to hit repetitions of zero (that is: 4 4 4 3 4 2 2 1
1 1 0 0 0). After a few seconds, I saw "0 1 0 1 0 1" start flying by on
the screen over and over at a very fast rate, and after a few more
seconds, I saw the system say "Giving up..." or something like that.
Then it rebooted.
When the machine came back up, every filesystem on every disk was
marked dirty.
fsck(8) ran in the background, but took an *incredible* amount of time
to complete on /dev/stripe/st0a (the gstripe set). "Incredible" means
at least an hour, maybe more. I was running gstat during that time, and
the gstripe set was pretty much at 100% utilisation, split 50/50 between
ad4 and ad6; nothing odd there.
I'm mailing -stable about this because it seems there may
be some sort of "deadlock" condition which can happen when using dump -L
on a system and then shutting it down. Maybe all of this points back to
the ATA subsystem and how long it'll wait for buffers to be flushed to
disk before actually shutting down. In my case, it obviously did not
wait long enough. There don't seem to be any tunables for how long to
continue trying/waiting either.
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |