ZFS kernel panics due to corrupt DVAs (despite RAIDZ)
Raymond Jimenez
raymondj at caltech.edu
Mon Nov 26 22:10:21 UTC 2012
Hello,
We recently sent our drives out for data recovery (blown drive
electronics), and when we got the new drives/data back, ZFS
started to kernel panic whenever we list certain items in a
directory, or whenever a scrub gets close to finishing (~99.97%).
The zpool worked fine before data recovery, and most of the
files are accessible (only a couple hundred unavailable out of
several million).
Here's the kernel panic output when I scrub the pool:
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x38
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff810792d1
stack pointer = 0x28:0xffffff8235122720
frame pointer = 0x28:0xffffff8235122750
code segment = base 0x0, limit 0xffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 52 (txg_thread_enter)
[thread pid 52 tid 101230 ]
Stopped at vdev_is_dead+0x1: cmpq $0x5, 0x38(%rdi)
%rdi is zero here, so this looks like a plain NULL-pointer dereference.
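For context, here is roughly the path I believe is faulting (my reading
of the sources, not a verbatim copy; that the faulting 0x38 offset is
vdev_state is an assumption on my part): the corrupt DVA carries an
out-of-range vdev id, vdev_lookup_top() returns NULL for it, and
vdev_is_dead() then dereferences that NULL:

/*
 * Rough sketch, not verbatim source.
 */
vdev_t *
vdev_lookup_top(spa_t *spa, uint64_t vdev)
{
        vdev_t *rvd = spa->spa_root_vdev;

        if (vdev < rvd->vdev_children)
                return (rvd->vdev_child[vdev]);

        return (NULL);          /* bogus vdev id from a corrupt DVA */
}

boolean_t
vdev_is_dead(vdev_t *vd)
{
        /* With vd == NULL, this read faults at offset 0x38. */
        return (vd->vdev_state < VDEV_STATE_DEGRADED);
}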
The vdev setup looks like:
  pool: mfs-zpool004
 state: ONLINE
  scan: scrub canceled on Mon Nov 26 05:40:49 2012
config:

        NAME                        STATE     READ WRITE CKSUM
        mfs-zpool004                ONLINE       0     0     0
          raidz1-0                  ONLINE       0     0     0
            gpt/lenin3-drive8       ONLINE       0     0     0
            gpt/lenin3-drive9.eli   ONLINE       0     0     0
            gpt/lenin3-drive10      ONLINE       0     0     0
            gpt/lenin3-drive11.eli  ONLINE       0     0     0
          raidz1-1                  ONLINE       0     0     0
            gpt/lenin3-drive12      ONLINE       0     0     0
            gpt/lenin3-drive13.eli  ONLINE       0     0     0
            gpt/lenin3-drive14      ONLINE       0     0     0
            gpt/lenin3-drive15.eli  ONLINE       0     0     0

errors: No known data errors
The initial scrub fixed some data (~24k) in the early stages, but
also crashed at 99.97%.
Right now, I'm using an interim work-around patch[1] so that our
users can get at their files without worrying about crashing the server.
It adds a small check in dbuf_findbp(): if the DVA that would be
returned has a vdev number that isn't small (<= 16), the function
returns EIO instead. This just results in ZFS returning I/O errors
for any of the corrupt files I try to access, which at least lets us
get at our data for now.
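The check is roughly the following (a simplified sketch, not the
actual patch[1]; the helper name is mine, and 16 is just a bound that
comfortably exceeds this pool's vdev count):

/*
 * Simplified sketch of the work-around; see [1] for the real patch.
 * Rejects block pointers whose DVAs name an implausibly large vdev.
 */
static boolean_t
dva_vdev_plausible(const blkptr_t *bp)
{
        int d;

        for (d = 0; d < BP_GET_NDVAS(bp); d++) {
                if (DVA_GET_VDEV(&bp->blk_dva[d]) > 16)
                        return (B_FALSE);
        }
        return (B_TRUE);
}

In dbuf_findbp(), the block pointer about to be handed back is run
through this, and EIO is returned rather than letting a garbage vdev
id reach vdev_lookup_top()/vdev_is_dead() later on.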
My suspicion is that somehow, bad data is getting interpreted as
a block pointer/shift constant, and this sends ZFS into the woods.
I haven't been able to track down how this data could get past
checksum verification, especially with RAIDZ.
Backtraces:
(both crashes due to vdev_is_dead() dereferencing a null pointer)
Scrub crash:
http://wsyntax.com/~raymond/zfs/zfs-scrub-bt.txt
Prefetch off, ls -al of "/06/chunk_0000000001417E06_00000001.mfs":
http://wsyntax.com/~raymond/zfs/zfs-ls-bt.txt
Regards,
Raymond Jimenez
[1] http://wsyntax.com/~raymond/zfs/zfs-dva-corrupt-workaround.patch