kern/90512: Snapshot corruption after fs activity
Nate Eldredge
nge at cs.hmc.edu
Fri Dec 16 11:20:21 PST 2005
>Number: 90512
>Category: kern
>Synopsis: Snapshot corruption after fs activity
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Fri Dec 16 19:20:03 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator: Nate Eldredge
>Release: FreeBSD 6.0-RELEASE amd64
>Organization:
>Environment:
System: FreeBSD vulcan.lan 6.0-RELEASE FreeBSD 6.0-RELEASE #0: Wed Dec 14 20:08:57 PST 2005 nate at vulcan.lan:/usr/obj/usr/src/sys/VULCAN amd64
>Description:
If you use mksnap_ffs to take a snapshot of a filesystem and then delete
and re-create a large number of files on that filesystem, the snapshot
becomes corrupt.
I think this is fairly serious since snapshots may be used for backup
purposes. That's how I originally discovered the problem; I made a
snapshot on /usr before making a bunch of changes, during which I
accidentally moved most of /usr/local to another partition :). I moved
it back but wanted to verify that everything was back as it was,
which is when I discovered my snapshot was no good.
Note this is on amd64. I have not tried i386.
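As a rough sketch of the kind of verification described above (checking a
mounted snapshot against the live tree), the following sh functions compare
the regular files in two directories by checksum. The names tree_manifest
and trees_match are hypothetical helpers, not existing tools, and the
approach assumes cksum(1) and find(1) can read both trees. On a corrupt
snapshot, find or cksum would be expected to print errors on stderr.

```sh
#!/bin/sh
# Hypothetical helper: print "checksum size path" for every regular file
# under directory $1, sorted by path.
tree_manifest() {
    (cd "$1" && find . -type f -exec cksum {} + | sort -k 3)
}

# Hypothetical helper: return 0 when both trees contain the same regular
# files with the same checksums, 1 when they differ, 2 when a manifest
# pipeline fails (for example, a directory cannot be entered).
trees_match() {
    a=$(tree_manifest "$1") || return 2
    b=$(tree_manifest "$2") || return 2
    [ "$a" = "$b" ]
}
```

For instance, `trees_match /mnt/md1/gap4r4 /mnt/md0/gap4r4` would compare
the snapshot's copy with the live one. It only looks at regular file
contents, not ownership, modes or timestamps.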
>How-To-Repeat:
# dd if=/dev/zero of=snaptest.img bs=1024k count=1000
# mdconfig -a -t vnode -f snaptest.img
md0
# newfs /dev/md0
# mount /dev/md0 /mnt/md0
# cd /mnt/md0
# tar xjf /usr/ports/distfiles/gap/gap4r4p6.tar.bz2
# mksnap_ffs /mnt/md0 /mnt/md0/.snap/snap1
# mdconfig -a -t vnode -f .snap/snap1
WARNING: opening backing store: /mnt/md0/.snap/snap1 readonly
md1
# mount -r /dev/md1 /mnt/md1
###### inspecting /mnt/md1 reveals the snapshot is apparently okay
# rm -r gap4r4
###### snapshot still apparently okay
# !tar
tar xjf /usr/ports/distfiles/gap/gap4r4p6.tar.bz2
# ls -l /mnt/md1/gap4r4
ls: Makefile.in: Bad file descriptor
ls: bin: Bad file descriptor
ls: cnf: Bad file descriptor
ls: configure: Bad file descriptor
ls: doc: Bad file descriptor
ls: etc: Bad file descriptor
ls: gap.shi: Bad file descriptor
ls: grp: Bad file descriptor
ls: pkg: Bad file descriptor
ls: prim: Bad file descriptor
ls: small: Bad file descriptor
ls: src: Bad file descriptor
ls: sysinfo.in: Bad file descriptor
ls: trans: Bad file descriptor
ls: tst: Bad file descriptor
total 38
-rw-r--r-- 1 nate nate 4782 Aug 29 06:19 README
-rw-r--r-- 1 nate nate 9725 May 11 2005 description4r4p5
-rw-r--r-- 1 nate nate 11660 Aug 29 06:05 description4r4p6
drwxr-xr-x 2 nate nate 9728 Aug 30 06:27 lib
Running truss on ls shows that lstat() returns EBADF for the offending
files (which doesn't make any sense, as there is no file descriptor involved;
EIO might be better). Also, unmounting /dev/md1 and then running fsck on it
produces a cornucopia of errors, of which this is a representative sample:
PARTIALLY TRUNCATED INODE I=70662
3689066227402421815 BAD I=70662
4121129229942796344 BAD I=70662
3833180345978203193 BAD I=70662
4051046384641915184 BAD I=70662
3688509874569295664 BAD I=70662
3472592161990062385 BAD I=70662
3906084542581519160 BAD I=70662
4049637910162848049 BAD I=70662
4123381021216356400 BAD I=70662
3979273551213759020 BAD I=70662
4051327820913194809 BAD I=70662
EXCESSIVE BAD BLKS I=70662
INCORRECT BLOCK COUNT I=70662 (960 should be 736)
PARTIALLY TRUNCATED INODE I=70719
UNALLOCATED I=23552 OWNER=nate MODE=0
DIRECTORY CORRUPTED I=70660 OWNER=nate MODE=40755
MISSING '.' I=71129 OWNER=nate MODE=40755
SIZE=1536 MTIME=Aug 30 06:27 2005
UNREF DIR I=117760 OWNER=nate MODE=40755
SIZE=512 MTIME=Aug 30 06:27 2005
LINK COUNT DIR I=2 OWNER=root MODE=40755
SIZE=512 MTIME=Dec 16 10:34 2005 COUNT 4 SHOULD BE 3
The original filesystem /dev/md0 apparently remains okay, and fsck reports
no errors for it.
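The BAD block numbers in the fsck sample above are implausibly large for a
1000 MB image. As a hedged sketch, the following sh function (decode_blkno
is a hypothetical name, not an existing tool) prints the eight bytes of a
64-bit value in little-endian order, the byte order used on amd64, to check
whether they look like text rather than block pointers. It assumes a shell
whose printf and arithmetic handle 64-bit integers.

```sh
#!/bin/sh
# Hypothetical helper: decode a 64-bit number into its eight bytes,
# least significant byte first, printing non-printable bytes as '.'.
decode_blkno() {
    hex=$(printf '%016x' "$1")      # 16 hex digits, most significant first
    out=
    i=15                            # start at the last hex pair (low byte)
    while [ "$i" -gt 0 ]; do
        pair=$(printf '%s' "$hex" | cut -c"$i"-"$((i + 1))")
        val=$((0x$pair))
        if [ "$val" -ge 32 ] && [ "$val" -le 126 ]; then
            # Build an octal escape such as \067 and let printf emit it.
            ch=$(printf "\\$(printf '%03o' "$val")")
        else
            ch=.
        fi
        out="$out$ch"
        i=$((i - 2))
    done
    printf '%s\n' "$out"
}
```

For example, `decode_blkno 3689066227402421815` prints `76650223` and
`decode_blkno 3472592161990062385` prints `1199, 10`, i.e. ASCII text.
That would be consistent with file data being read where block pointers
were expected, though it is only an observation about the numbers and not
a diagnosis.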
There are no kernel error messages this time, though a previous attempt
(when the snapshot was on /dev/md0) yielded
/mnt/md0: bad dir ino 3182535 at offset 0: mangled entry
/mnt/md0: bad dir ino 2953 at offset 0: mangled entry
...4 or 5 more...
Also at that time, some directories had turned into files of size 1,
which dumped many, many bytes of garbage when cat'ted.
>Fix:
Unknown.
Thanks!
>Release-Note:
>Audit-Trail:
>Unformatted: