Crash in vdev_dtl_reassess()
Dave Baukus
daveb at spectralogic.com
Tue Sep 3 23:59:48 UTC 2019
On FreeBSD 12.0-STABLE I have a panic in vdev_dtl_reassess()
because spa->spa_dsl_pool is NULL when spa->spa_dsl_pool->dp_scan is read:
#7 trap (frame=0xfffffe026d6e07d0) at sys/amd64/amd64/trap.c:443
#8 <signal handler called>
#9 vdev_dtl_reassess (vd=0xfffff80010661000, txg=0, scrub_txg=0, scrub_done=0) at sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:2519
#10 vdev_dtl_reassess (vd=0xfffff8001c6d7000, txg=0, scrub_txg=0, scrub_done=0) at sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:2512
#11 vdev_dtl_reassess (vd=0xfffff80318df5000, txg=0, scrub_txg=0, scrub_done=0) at sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:2512
#12 spa_vdev_state_exit (spa=0xfffffe01bd7fe000, vd=0x0, error=0) at sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c:1396
#13 spa_async_thread_vd (arg=0xfffffe01bd7fe000) at sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:7320
Line 2519 of vdev.c is:
	if (vd->vdev_ops->vdev_op_leaf) {
		dsl_scan_t *scn = spa->spa_dsl_pool->dp_scan;
		...
I believe I know how I got here, but the proper solution is not obvious to me.
I have a three-part train wreck:
Part1:
/usr/tests/sys/devad/devad_test is running, turning disk PHYs on and off.
Part2:
A process is executing zpool import (no arguments); this leads to
spa_tryimport(). I know spa_load() failed because spa->spa_load_state == SPA_LOAD_ERROR,
and I conclude that the root failure occurred in dsl_pool_init(), as called
from spa_ld_open_rootbp(), because spa->spa_dsl_pool and spa->spa_meta_objset
are both NULL. The importing thread is waiting in spa_config_enter():
#3 _cv_wait (cvp=0xfffffe01bd7fef48, lock=0xfffffe01bd7fef18) at sys/kern/kern_condvar.c:146
#4 spa_config_enter (spa=0xfffffe01bd7fe000, locks=3, tag=<optimized out>, rw=RW_READER) at sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c:559
#5 spa_config_generate (spa=0xfffffe01bd7fe000, vd=0xfffff80318df5000, txg=18446744073709551615, getstats=1) at sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c:415
#6 spa_tryimport (tryconfig=<optimized out>) at sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:5642
#7 zfs_ioc_pool_tryimport (zc=0xfffffe02331fc000) at sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:1756
#8 zfsdev_ioctl (dev=<optimized out>, zcmd=<optimized out>, arg=0xfffffe00ff4238c0 "\a", flag=<optimized out>, td=<optimized out>) at sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:6804
Part3:
The spa_async_thread_vd() thread is processing a (tasks & SPA_ASYNC_REMOVE) event for
the same spa that spa_tryimport() is waiting to finish processing.
Thus, I have spa_async_thread_vd() recursing through the vdevs of an ephemeral spa that is
not fully initialized, and we crash when vdev_dtl_reassess(), called from that thread,
dereferences spa->spa_dsl_pool.
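
To make the failure concrete, here is a small user-space model of the walk
in the first backtrace. It is only a sketch: every mock_* type and function
is a simplified stand-in invented for illustration, not the kernel
definition, and the NULL check shows where a guard could sit rather than
being a proposed patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock types: simplified stand-ins, not the kernel definitions. */
typedef struct mock_dsl_scan { int scn_dummy; } mock_dsl_scan_t;
typedef struct mock_dsl_pool { mock_dsl_scan_t *dp_scan; } mock_dsl_pool_t;
typedef struct mock_spa { mock_dsl_pool_t *spa_dsl_pool; } mock_spa_t;

typedef struct mock_vdev {
	mock_spa_t *vdev_spa;
	bool vdev_is_leaf;
	struct mock_vdev **vdev_child;
	size_t vdev_children;
} mock_vdev_t;

/*
 * Recurse into the children first, then handle the leaf case, as the
 * backtrace suggests. Returns the number of leaves whose scan state was
 * examined. Leaves of a spa without a dsl pool are skipped instead of
 * being dereferenced.
 */
static size_t
mock_dtl_reassess(mock_vdev_t *vd)
{
	size_t examined = 0;

	assert(vd != NULL);
	for (size_t c = 0; c < vd->vdev_children; c++)
		examined += mock_dtl_reassess(vd->vdev_child[c]);

	if (vd->vdev_is_leaf) {
		mock_spa_t *spa = vd->vdev_spa;

		/* Hypothetical guard: dsl_pool_init() never succeeded. */
		if (spa->spa_dsl_pool == NULL)
			return (examined);

		mock_dsl_scan_t *scn = spa->spa_dsl_pool->dp_scan;
		(void)scn;	/* the real code would consult scan state here */
		examined++;
	}
	return (examined);
}
```

Without the guard, the model faults on the same expression as the kernel
does when the spa has no dsl pool. A guard this deep only hides the symptom;
other code reached from the async thread may make the same assumption.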
It seems that a spa that is being constructed simply to glean the pool's configuration
should be tagged so that spa_async_thread_vd() and others don't assume it is fully
constructed.
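
A rough sketch of what such a tag might look like, again as a user-space
model. The spa_config_only flag, MOCK_ASYNC_REMOVE, and all mock_* names are
assumptions made up for this example; only SPA_LOAD_ERROR and the idea of
checking before the vdev walk come from the report above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock load states; only the error state is named in the report. */
typedef enum mock_load_state {
	MOCK_LOAD_NONE,
	MOCK_LOAD_ERROR
} mock_load_state_t;

#define	MOCK_ASYNC_REMOVE	0x01u	/* stand-in for SPA_ASYNC_REMOVE */

typedef struct mock_spa {
	void *spa_dsl_pool;		/* NULL if pool init failed */
	mock_load_state_t spa_load_state;
	bool spa_config_only;		/* hypothetical "ephemeral spa" tag */
	unsigned spa_async_tasks;
} mock_spa_t;

/* An spa may be walked only if it is not tagged and finished loading. */
static bool
mock_spa_async_allowed(const mock_spa_t *spa)
{
	return (!spa->spa_config_only &&
	    spa->spa_load_state != MOCK_LOAD_ERROR &&
	    spa->spa_dsl_pool != NULL);
}

/*
 * Model of the async thread body. Returns true if the removal walk ran,
 * false if the request was dropped because the spa is not fully built.
 */
static bool
mock_async_thread_vd(mock_spa_t *spa)
{
	unsigned tasks = spa->spa_async_tasks;

	spa->spa_async_tasks = 0;
	if (!mock_spa_async_allowed(spa))
		return (false);
	if (tasks & MOCK_ASYNC_REMOVE) {
		/* The vdev removal walk and state exit would happen here. */
		return (true);
	}
	return (false);
}
```

In this model the pending request is simply discarded for a tagged spa.
Whether the real code should drop it, defer it, or never queue it for such
an spa is exactly the open question.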
--
Dave Baukus