panic: solaris assert: rt->rt_space == 0 (0xe000 == 0x0), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/range_tree.c, line: 153
Fabian Keil
freebsd-listen at fabiankeil.de
Sat Feb 14 14:19:03 UTC 2015
Fabian Keil <freebsd-listen at fabiankeil.de> wrote:
> Using an 11.0-CURRENT based on r276255 I just got a panic
> after trying to export a certain ZFS pool:
[...]
> #10 0xffffffff81bdd22f in assfail3 (a=<value optimized out>, lv=<value optimized out>, op=<value optimized out>, rv=<value optimized out>, f=<value optimized out>, l=<value optimized out>)
> at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:91
> #11 0xffffffff8194afc4 in range_tree_destroy (rt=0xfffff80011586000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/range_tree.c:153
> #12 0xffffffff819488bc in metaslab_fini (msp=0xfffff800611a9800) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:1398
> #13 0xffffffff81965841 in vdev_free (vd=0xfffff8000696d800) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:994
> #14 0xffffffff819657e1 in vdev_free (vd=0xfffff80040532000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:683
> #15 0xffffffff81953948 in spa_unload (spa=0xfffff800106af000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1314
> #16 0xffffffff81957a58 in spa_export_common (pool=<value optimized out>, new_state=1, oldconfig=0x0, force=<value optimized out>, hardforce=0)
> at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:4540
> #17 0xffffffff81957b08 in spa_export (pool=0x0, oldconfig=0xfffffe0094a624f0, force=128, hardforce=50) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:4574
> #18 0xffffffff8199ed50 in zfs_ioc_pool_export (zc=0xfffffe0006fbf000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:1618
[...]
> (kgdb) f 11
> #11 0xffffffff8194afc4 in range_tree_destroy (rt=0xfffff80011586000) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/range_tree.c:153
> 153 VERIFY0(rt->rt_space);
[...]
>
> After rebooting and reimporting the pool it looked like this:
>
> fk at r500 ~ $ sudo zpool status -v wde4
>   pool: wde4
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://illumos.org/msg/ZFS-8000-8A
>   scan: scrub canceled on Tue Jan 20 00:22:26 2015
> config:
>
>         NAME              STATE     READ WRITE CKSUM
>         wde4              ONLINE       0     0    19
>           label/wde4.eli  ONLINE       0     0    76
>
> errors: Permanent errors have been detected in the following files:
>
>         <0xaf11f>:<0x0>
>         wde4/backup/r500/tank/home/fk:<0x0>
>         <0xffffffffffffffff>:<0x0>
>
> The export triggered the same panic again, but with a different rt->rt_space value:
>
> panic: solaris assert: rt->rt_space == 0 (0x22800 == 0x0), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/range_tree.c, line: 153
>
> I probably won't have time to scrub the pool and investigate this further
> until next week.
With this patch applied and vfs.zfs.recover=1 set, the pool can be exported without panicking:
https://www.fabiankeil.de/sourcecode/electrobsd/range_tree_destroy-Optionally-tolerate-non-zero-rt-r.diff
Warnings from three pool exports:
Feb 14 13:49:22 r500 kernel: [268] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 12000
Feb 14 13:49:22 r500 kernel: [268] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 2e200
Feb 14 13:49:22 r500 kernel: [268] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 12000
Feb 14 13:49:22 r500 kernel: [268] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 2e200
Feb 14 13:49:22 r500 kernel: [268] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 12000
Feb 14 13:49:22 r500 kernel: [268] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 2e200
Feb 14 13:50:25 r500 kernel: [331] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 11200
Feb 14 13:50:25 r500 kernel: [331] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 2ea00
Feb 14 13:50:25 r500 kernel: [331] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 11200
Feb 14 13:50:25 r500 kernel: [331] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 2ea00
Feb 14 13:50:25 r500 kernel: [331] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 11200
Feb 14 13:50:25 r500 kernel: [331] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 2ea00
Feb 14 13:52:27 r500 kernel: [453] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 12600
Feb 14 13:52:27 r500 kernel: [453] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 2ea00
Feb 14 13:52:27 r500 kernel: [453] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 12600
Feb 14 13:52:27 r500 kernel: [453] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 2ea00
Feb 14 13:52:27 r500 kernel: [453] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 12600
Feb 14 13:52:27 r500 kernel: [453] Solaris: WARNING: zfs: range_tree_destroy(): rt->rt_space != 0: 2ea00
My impression is that the messages are the result of metaslab_fini() triggering
the problem three times per export for each tree in msp->ms_defertree.
If the pool is imported readonly, the problem isn't triggered.
Due to interruptions the scrubbing will probably take a couple of days.
ZFS continues to complain about checksum errors but apparently no
affected files have been found yet:
fk at r500 ~ $ sudo zpool status -v wde4
  pool: wde4
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Sat Feb 14 14:19:15 2015
        32.0G scanned out of 1.68T at 10.8M/s, 44h25m to go
        0 repaired, 1.86% done
config:

        NAME              STATE     READ WRITE CKSUM
        wde4              ONLINE       0     0   867
          label/wde4.eli  ONLINE       0     0 3.39K

errors: Permanent errors have been detected in the following files:

        <0xaf11f>:<0x0>
        wde4/backup/r500/tank/home/fk:<0x0>
        <0xffffffffffffffff>:<0x0>
BTW, any opinions on making vfs.zfs.recover changeable without a reboot?
https://www.fabiankeil.de/sourcecode/electrobsd/Make-vfs.zfs.recover-writable-after-boot.diff
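The change essentially boils down to declaring the sysctl with CTLFLAG_RWTUN instead of CTLFLAG_RDTUN, roughly like this (a sketch, not the exact diff; the description string is illustrative):

```c
/* CTLFLAG_RWTUN keeps the loader-tunable behaviour but also
 * allows the value to be changed via sysctl(8) at runtime. */
extern int zfs_recover;
SYSCTL_INT(_vfs_zfs, OID_AUTO, recover, CTLFLAG_RWTUN, &zfs_recover, 0,
    "Try to recover from otherwise fatal errors");
```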
Fabian