[Bug 206109] zpool import of corrupt pool causes system to reboot
bugzilla-noreply at freebsd.org
Sun Jan 10 18:38:10 UTC 2016
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206109
Bug ID: 206109
Summary: zpool import of corrupt pool causes system to reboot
Product: Base System
Version: 10.2-RELEASE
Hardware: Any
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: kern
Assignee: freebsd-bugs at FreeBSD.org
Reporter: emilec at clarotech.co.za
I recently set up a new RAIDZ2 pool with 5 x 4TB Seagate NAS drives using
NAS4Free 10.2.0.2 (revision 2235). After copying data from an existing NAS to
the new pool, I discovered that some corruption had been detected. I attempted
to run a scrub, but partway through the system crashed and went into a boot
loop.
I reloaded NAS4Free and tried to import the pool, but each attempt rebooted the
system. I then tried FreeBSD-10.2-RELEASE-amd64-mini-memstick, and importing
the pool caused that system to reboot as well. I could, however, import the
pool read-only and access the data.
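For reference, the read-only recovery import described above (the exact command also appears with the status output further below) takes this form; the pool name "pool0" and the altroot are from this report, and the flag meanings are per zpool(8):

```shell
# Import the damaged pool without ever writing to it:
#   -f              force import (pool was last used by another system)
#   -F              rewind to an earlier transaction group if needed
#   -o readonly=on  disallow all writes to the damaged pool
#   -R /pool0       mount datasets under an alternate root
zpool import -F -f -o readonly=on -R /pool0 pool0
zpool status -v pool0   # -v lists the individual files with errors
```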
From the NAS4Free logs I was able to obtain the following from the crash after
attempting an import:
Jan 1 16:21:28 nas4free syslogd: kernel boot file is /boot/kernel/kernel
Jan 1 16:21:28 nas4free kernel: Solaris: WARNING: blkptr at 0xfffffe0003a5fa40 DVA 1 has invalid VDEV 16384
Jan 1 16:21:28 nas4free kernel:
Jan 1 16:21:28 nas4free kernel:
Jan 1 16:21:28 nas4free kernel: Fatal trap 12: page fault while in kernel mode
Jan 1 16:21:28 nas4free kernel: cpuid = 1; apic id = 01
Jan 1 16:21:28 nas4free kernel: fault virtual address = 0x50
Jan 1 16:21:28 nas4free kernel: fault code = supervisor read data, page not present
Jan 1 16:21:28 nas4free kernel: instruction pointer = 0x20:0xffffffff81e79f94
Jan 1 16:21:28 nas4free kernel: stack pointer = 0x28:0xfffffe0169ef5740
Jan 1 16:21:28 nas4free kernel: frame pointer = 0x28:0xfffffe0169ef5750
Jan 1 16:21:28 nas4free kernel: code segment = base 0x0, limit 0xfffff, type 0x1b
Jan 1 16:21:28 nas4free kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
Jan 1 16:21:28 nas4free kernel: processor eflags = interrupt enabled, resume, IOPL = 0
Jan 1 16:21:28 nas4free kernel: current process = 6 (txg_thread_enter)
Jan 1 16:21:28 nas4free kernel: trap number = 12
Jan 1 16:21:28 nas4free kernel: panic: page fault
Jan 1 16:21:28 nas4free kernel: cpuid = 1
Jan 1 16:21:28 nas4free kernel: KDB: stack backtrace:
Jan 1 16:21:28 nas4free kernel: #0 0xffffffff80a86a70 at kdb_backtrace+0x60
Jan 1 16:21:28 nas4free kernel: #1 0xffffffff80a4a1d6 at vpanic+0x126
Jan 1 16:21:28 nas4free kernel: #2 0xffffffff80a4a0a3 at panic+0x43
Jan 1 16:21:28 nas4free kernel: #3 0xffffffff80ecaedb at trap_fatal+0x36b
Jan 1 16:21:28 nas4free kernel: #4 0xffffffff80ecb1dd at trap_pfault+0x2ed
Jan 1 16:21:28 nas4free kernel: #5 0xffffffff80eca87a at trap+0x47a
Jan 1 16:21:28 nas4free kernel: #6 0xffffffff80eb0c72 at calltrap+0x8
Jan 1 16:21:28 nas4free kernel: #7 0xffffffff81e8071f at vdev_mirror_child_select+0x6f
Jan 1 16:21:28 nas4free kernel: #8 0xffffffff81e802d0 at vdev_mirror_io_start+0x270
Jan 1 16:21:28 nas4free kernel: #9 0xffffffff81e9cd86 at zio_vdev_io_start+0x1d6
Jan 1 16:21:28 nas4free kernel: #10 0xffffffff81e998b2 at zio_execute+0x162
Jan 1 16:21:28 nas4free kernel: #11 0xffffffff81e991b9 at zio_nowait+0x49
Jan 1 16:21:28 nas4free kernel: #12 0xffffffff81e1c91e at arc_read+0x8fe
Jan 1 16:21:28 nas4free kernel: #13 0xffffffff81e577b2 at dsl_scan_prefetch+0xc2
Jan 1 16:21:28 nas4free kernel: #14 0xffffffff81e574a3 at dsl_scan_visitbp+0x583
Jan 1 16:21:28 nas4free kernel: #15 0xffffffff81e5722f at dsl_scan_visitbp+0x30f
Jan 1 16:21:28 nas4free kernel: #16 0xffffffff81e5722f at dsl_scan_visitbp+0x30f
Jan 1 16:21:28 nas4free kernel: Copyright (c) 1992-2015 The FreeBSD Project.
Status of the pool after a read-only import:
zpool import -F -f -o readonly=on -R /pool0 pool0
zpool status
pool: pool0
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://illumos.org/msg/ZFS-8000-8A
scan: scrub in progress since Wed Dec 30 13:34:03 2015
1.06T scanned out of 8.53T at 1/s, (scan is slow, no estimated time)
0 repaired, 12.45% done
config:
        NAME        STATE     READ WRITE CKSUM
        pool0       ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0     0
            ada2    ONLINE       0     0     0
            ada3    ONLINE       0     0     0
            ada4    ONLINE       0     0     0
errors: 1 data errors, use '-v' for a list
I eventually discovered that the corruption was caused by faulty RAM (it fails
memtest), so I accept that the pool is corrupt.
Since NAS4Free is based on FreeBSD and the behaviour is the same, I thought
this would be the best place to log a bug, but feel free to point me back to
NAS4Free. Their forums suggested that ZFS is enterprise software and that an
enterprise would simply restore from backup. It would be better to catch the
error and report it rather than reboot the system.