Snapshot corruption on 6.1/amd64

Lapo Luchini lapo at lapo.it
Mon Nov 13 15:20:39 UTC 2006


Kostik Belousov <kostikbel <at> gmail.com> writes:

> > >After some searching, I've found a bug report filed last year that
> > >describes this problem exactly, though the log of that report does
> > >not suggest that anything has been done with it.  That report is at
> > >http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/90512
>
> James, look at the PR/100365. Supposed fix is MFCed. Original reporter
> said that this changed nothing for him. I have not much time lately to
> look at this problem, but would like to get additional data points.
>
> BTW, use of snapshots with stock 6.1 is not very attractive idea, better
> to update to the 6-STABLE (many important fixes in that area were made).

I have had a problem with snapshots too, and I also use amd64.
Neither PR's description seems to match my case: I compiled 6.1-STABLE at
the beginning of September and enabled snapshots on the whole /usr FS, and had
no problems until the beginning of October, when:

Oct 22 04:00:33 motoko root: snapshot: daily.0 snapshot on filesystem / made (duration: 0 min)
Oct 22 04:03:29 motoko root: snapshot: daily.0 snapshot on filesystem /usr made (duration: 2 min)
Oct 22 04:03:47 motoko root: snapshot: daily.0 snapshot on filesystem /var made (duration: 0 min)
[machine manually reset]
Oct 23 11:09:21 motoko syslogd: kernel boot file is /boot/kernel/kernel
Oct 23 11:09:21 motoko kernel: Copyright (c) 1992-2006 The FreeBSD Project.
[...]
Oct 23 11:11:02 motoko fsck: /dev/ad0s1e: 4449 files, 118197 used, 135618 free (8882 frags, 15842 blocks, 3.5% fragmentation)
Oct 23 11:11:18 motoko fsck: /dev/ad0s1d: UNREF FILE I=23564  OWNER=operator MODE=100400
[...many more...]
Oct 23 11:11:19 motoko fsck: /dev/ad0s1d: UNREF FILE I=212299  OWNER=www MODE=100600
Oct 23 11:11:19 motoko fsck: /dev/ad0s1d: SIZE=2048 MTIME=Oct  1 15:57 2006 (CLEARED)
Oct 23 11:11:19 motoko fsck: /dev/ad0s1d: Reclaimed: 0 directories, 1991 files, 1832 fragments
Oct 23 11:11:19 motoko fsck: /dev/ad0s1d: 18768 files, 83120 used, 915663 free ( 6839 frags, 113603 blocks, 0.7% fragmentation)
Oct 23 11:13:49 motoko ntpd[670]: kernel time sync disabled 2041
Oct 23 11:21:10 motoko syslogd: kernel boot file is /boot/kernel/kernel
Oct 23 11:21:10 motoko kernel: panic: snapblkfree: inconsistent block type
Oct 23 11:21:10 motoko kernel: Uptime: 20m38s
Oct 23 11:21:10 motoko kernel: Cannot dump. No dump device defined.
Oct 23 11:21:10 motoko kernel: Automatic reboot in 15 seconds - press a key on the console to abort
Oct 23 11:21:10 motoko kernel: Copyright (c) 1992-2006 The FreeBSD Project.

After this the box looped 27 times through { fsck; panic; reset; } until it
finally crashed for good.
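
For anyone who wants to count such a loop without reading the whole log by
hand, something along these lines might help. It is only a sketch: the name
count_panics is made up for illustration, and it assumes the syslog layout
shown in the excerpt above, where the kernel line contains "panic: " followed
by the message.

```python
import re
from collections import Counter

# Matches syslog kernel lines such as
#   "Oct 23 11:21:10 motoko kernel: panic: snapblkfree: inconsistent block type"
# The layout is assumed from the excerpt in this message; other syslog
# configurations may format the line differently.
PANIC_RE = re.compile(r"kernel: panic: (?P<msg>.+)$")


def count_panics(lines):
    """Return a Counter mapping each panic message to its number of occurrences."""
    counts = Counter()
    for line in lines:
        match = PANIC_RE.search(line.rstrip("\n"))
        if match:
            counts[match.group("msg").strip()] += 1
    return counts
```

Feeding it an open log file, for example count_panics(open("/var/log/messages")),
gives one tally per distinct panic string, which makes it easy to see whether
every reboot hit the same panic.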

I then decided to stop taking new snapshots and to configure a dump device, but
after a few days a new problem appeared:

Dump header from device /dev/ad0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 1056505856B (1007 MB)
  Blocksize: 512
  Dumptime: Fri Nov  3 04:25:36 2006
  Hostname: motoko.lapo.it
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 6.1-STABLE #4: Fri Sep  1 17:02:50 CEST 2006
    root at motoko.lapo.it:/usr/obj/usr/src/sys/MOTOKO
  Panic String: snapacct_ufs2: bad block
  Dump Parity: 2648692799
  Bounds: 1
  Dump Status: good
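
If it helps to compare several dumps, a header like the one above can be
turned into a dictionary with a few lines of Python. This is a rough sketch
that assumes the header is available as plain text in exactly this
"Key: value" layout; parse_dump_header is a hypothetical helper name, not part
of any FreeBSD tool.

```python
def parse_dump_header(text):
    """Parse 'Key: value' lines of a dump header into a dict.

    Lines without a ': ' separator are treated as a continuation of the
    previous field (as with the wrapped 'Version String' above); any such
    line before the first field, like the 'Dump header from device' title,
    is ignored.
    """
    fields = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first ': ' only, so values may contain colons.
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value.strip()
            last_key = key
        elif last_key is not None:
            fields[last_key] += " " + line
    return fields
```

With the header above, the "Panic String" entry would hold
"snapacct_ufs2: bad block", which is the value worth comparing across crashes.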

I solved this by removing all existing snapshots, but by then I had accumulated
enough downtime and frustration (and angry users) that I don't want to try
snapshots again unless I have a strong indication that the problem has really
been solved, which explains why I noticed this thread. The obvious question
is: might this problem be resolved by the fix for PR/100365 (it seems quite
different to me, but I don't know the internals), or is it a new one?

I have the dump file for the latest problem.



More information about the freebsd-fs mailing list