kernel panic loop after zpool remove

Thu Jun 4 14:11:11 UTC 2020


Is there any way to get zfs and the zpools into a recovery 'safe-mode' where
it stops trying to do any manipulations of the zpools?  I'm stuck in a
kernel panic loop after 'zpool remove'.  With the recent discussions about
openzfs I was also considering switching to openzfs, but I don't want
to get myself into more trouble!  I just need to stop the remove vdev
operation and this 'metaslab free' code flow from running on the new vdev
so I can keep the system up long enough to read the data out.

Thank you,

The system is running 'FreeBSD 12.1-RELEASE-p5 GENERIC amd64'.
panic: solaris assert: ((offset) & ((1ULL << vd->vdev_ashift) - 1)) == 0
(0x400 == 0x0), file:
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c, line:
cpuid = 1
time = 1590769420
KDB: stack backtrace:
#0 0xffffffff80c1d307 at kdb_backtrace+0x67
#1 0xffffffff80bd063d at vpanic+0x19d
#2 0xffffffff80bd0493 at panic+0x43
#3 0xffffffff82a6922c at assfail3+0x2c
#4 0xffffffff828a3b83 at metaslab_free_concrete+0x103
#5 0xffffffff828a4dd8 at metaslab_free+0x128
#6 0xffffffff8290217c at zio_dva_free+0x1c
#7 0xffffffff828feb7c at zio_execute+0xac
#8 0xffffffff80c2fae4 at taskqueue_run_locked+0x154
#9 0xffffffff80c30e18 at taskqueue_thread_loop+0x98
#10 0xffffffff80b90c53 at fork_exit+0x83
#11 0xffffffff81082c2e at fork_trampoline+0xe
Uptime: 7s

I have a core dump and can provide more details to help debug the issue,
and I can open a bug report too:
(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu.h:234
#1  doadump (textdump=<optimized out>) at
#2  0xffffffff80bd0238 in kern_reboot (howto=260) at
#3  0xffffffff80bd0699 in vpanic (fmt=<optimized out>, ap=<optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:877
#4  0xffffffff80bd0493 in panic (fmt=<unavailable>) at
#5  0xffffffff82a6922c in assfail3 (a=<unavailable>, lv=<unavailable>,
op=<unavailable>, rv=<unavailable>, f=<unavailable>, l=<optimized out>)
    at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:91
#6  0xffffffff828a3b83 in metaslab_free_concrete (vd=0xfffff80004623000,
offset=137438954496, asize=<optimized out>, checkpoint=0)
#7  0xffffffff828a4dd8 in metaslab_free_dva (spa=<optimized out>,
checkpoint=0, dva=<optimized out>)
#8  metaslab_free (spa=<optimized out>, bp=0xfffff800043788a0,
txg=41924766, now=<optimized out>)
#9  0xffffffff8290217c in zio_dva_free (zio=0xfffff80004378830) at
#10 0xffffffff828feb7c in zio_execute (zio=0xfffff80004378830) at
#11 0xffffffff80c2fae4 in taskqueue_run_locked (queue=0xfffff80004222800)
at /usr/src/sys/kern/subr_taskqueue.c:467
#12 0xffffffff80c30e18 in taskqueue_thread_loop (arg=<optimized out>) at
#13 0xffffffff80b90c53 in fork_exit (callout=0xffffffff80c30d80
<taskqueue_thread_loop>, arg=0xfffff800041d90b0, frame=0xfffffe004dcf9bc0)
    at /usr/src/sys/kern/kern_fork.c:1065
#14 <signal handler called>
(kgdb) frame 6
#6  0xffffffff828a3b83 in metaslab_free_concrete (vd=0xfffff80004623000,
offset=137438954496, asize=<optimized out>, checkpoint=0)
3593            VERIFY0(P2PHASE(offset, 1ULL << vd->vdev_ashift));
(kgdb) p /x offset
$1 = 0x2000000400
(kgdb) p /x vd->vdev_ashift
$2 = 0xc
(kgdb) frame 9
#9  0xffffffff8290217c in zio_dva_free (zio=0xfffff80004378830) at
3070            metaslab_free(zio->io_spa, zio->io_bp, zio->io_txg,
(kgdb) p /x zio->io_spa->spa_root_vdev->vdev_child[0]->vdev_removing
$3 = 0x1
(kgdb) p /x zio->io_spa->spa_root_vdev->vdev_child[0]->vdev_ashift
$4 = 0x9
(kgdb) p /x zio->io_spa->spa_root_vdev->vdev_child[1]->vdev_ashift
$5 = 0xc
(kgdb) frame 8
#8  metaslab_free (spa=<optimized out>, bp=0xfffff800043788a0,
txg=41924766, now=<optimized out>)
4145                            metaslab_free_dva(spa, &dva[d], checkpoint);
(kgdb) p /x *bp
$6 = {blk_dva = {{dva_word = {0x100000002, 0x10000002}}, {dva_word = {0x0,
0x0}}, {dva_word = {0x0, 0x0}}}, blk_prop = 0x8000020200010001, blk_pad =
    0x0}, blk_phys_birth = 0x0, blk_birth = 0x4, blk_fill = 0x0, blk_cksum
= {zc_word = {0x0, 0x0, 0x0, 0x0}}}
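
As a cross-check of the kgdb output above, here is a minimal sketch that
decodes the first DVA of the block pointer (dva_word = {0x100000002,
0x10000002}).  The bit layout is an assumption taken from my reading of the
DVA_GET_* macros in spa.h (asize and offset stored in 512-byte units, vdev id
in bits 32 and up of word 0); it has not been checked against the 12.1
source.  decode_dva is a hypothetical helper name, not a ZFS function.

```python
# Assumed constants, modelled on spa.h (not verified against this release).
SPA_MINBLOCKSHIFT = 9   # sizes and offsets are stored in 512-byte sectors
SPA_ASIZEBITS = 24      # width of the allocated-size field in word 0
SPA_VDEVBITS = 24       # width of the vdev id field in word 0


def decode_dva(word0, word1):
    """Split the two 64-bit DVA words into vdev, asize, offset and gang bit."""
    asize = (word0 & ((1 << SPA_ASIZEBITS) - 1)) << SPA_MINBLOCKSHIFT
    vdev = (word0 >> 32) & ((1 << SPA_VDEVBITS) - 1)
    offset = (word1 & ((1 << 63) - 1)) << SPA_MINBLOCKSHIFT
    gang = word1 >> 63
    return {"vdev": vdev, "asize": asize, "offset": offset, "gang": gang}


# Values copied from the 'p /x *bp' output in the kgdb session.
dva = decode_dva(0x100000002, 0x10000002)
for name, value in dva.items():
    print(f"{name:6s} = {value:#x}")
```

Under those assumptions the DVA points at vdev 1 (mirror-1, ashift 12) with
offset 0x2000000400, which is the same offset kgdb reports in
metaslab_free_concrete, and an allocated size of 0x400 bytes.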

Some notes on how I got into this state, from the forum post:

I had two 2TB drives in a mirror configuration for the last 7 years,
upgrading FreeBSD and zfs as time went by.  I finally needed more storage
and tried to add two 4TB drives as a second mirror vdev:
'zpool add storage mirror /dev/ada3 /dev/ada4'

Next, 'zpool status' showed:
  pool: storage
state: ONLINE
status: One or more devices are configured to use a non-native block size.
        Expect reduced performance.
action: Replace affected devices with devices that support the
        configured block size, or migrate data to a properly configured
  scan: scrub repaired 0 in 0 days 08:39:28 with 0 errors on Sat May  9 01:19:54 2020

        NAME                             STATE     READ WRITE CKSUM
        storage                          ONLINE       0     0     0
          mirror-0                       ONLINE       0     0     0
            ada1                         ONLINE       0     0     0  block size: 512B configured, 4096B native
            ada2                         ONLINE       0     0     0  block size: 512B configured, 4096B native
          mirror-1                       ONLINE       0     0     0
            ada3                         ONLINE       0     0     0
            ada4                         ONLINE       0     0     0

errors: No known data errors

I should have stopped there, but I saw the block size warning and thought I
would try to fix it.  zdb showed mirror-0 with ashift 9 (512 byte
alignment) and mirror-1 with ashift 12 (4096 byte alignment).  I issued
'zpool remove storage mirror-0' and quickly went into a panic reboot loop.
After rebooting into single user mode, the first zfs or zpool command loads
the driver and it panics again.  With the new drives powered off, the system
does not panic on reboot, but the pool is then unavailable because it is
missing a top-level vdev (mirror-1).
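
To make the ashift mismatch concrete, here is a small sketch of the
alignment test behind the failing assertion, applied to the offset from the
core dump (0x2000000400) for both ashift values.  p2phase is my own
stand-in for the kernel's P2PHASE macro, written from the expression shown
in the panic message; the rest is plain arithmetic.

```python
def p2phase(x, align):
    """Remainder of x modulo a power-of-two alignment (like P2PHASE)."""
    return x & (align - 1)


offset = 0x2000000400  # offset passed to metaslab_free_concrete in the dump

for ashift in (9, 12):
    remainder = p2phase(offset, 1 << ashift)
    status = "aligned" if remainder == 0 else "NOT aligned"
    print(f"ashift {ashift:2d}: block size {1 << ashift:4d}, "
          f"remainder {remainder:#x} -> {status}")
```

The offset is a multiple of 512 but not of 4096, so it passes the check for
ashift 9 and leaves a remainder of 0x400 for ashift 12, which matches the
'0x400 == 0x0' in the panic message.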

