[Bug 267009] OpenZFS: panic: VERIFY3(0 == nvlist_lookup_uint64(nvl, name, &rv)) failed (0 == 22)

From: <bugzilla-noreply_at_freebsd.org>
Date: Thu, 13 Oct 2022 10:17:11 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267009

            Bug ID: 267009
           Summary: OpenZFS: panic: VERIFY3(0 == nvlist_lookup_uint64(nvl,
                    name, &rv)) failed (0 == 22)
           Product: Base System
           Version: 13.1-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: Trond.Endrestol@ximalas.info

My zpools looked very much like this to begin with:

NAME                            SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
builder01_zroot                 164G   117G  47.2G        -         -    14%    71%  1.00x    ONLINE  -
  raidz1-0                      164G   117G  47.2G        -         -    14%  71.2%      -    ONLINE
    gpt/builder01_zroot0           -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zroot1           -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zroot2           -      -      -        -         -      -      -      -    ONLINE
logs                               -      -      -        -         -      -      -      -  -
  mirror-1                     16.5G   204K  16.5G        -         -     0%  0.00%      -    ONLINE
    gpt/builder01_zroot_zlog0      -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zroot_zlog1      -      -      -        -         -      -      -      -    ONLINE
builder01_zwork                 374G   237G   137G        -         -    38%    63%  1.00x    ONLINE  -
  raidz1-0                      374G   237G   137G        -         -    38%  63.3%      -    ONLINE
    gpt/builder01_zwork0           -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zwork1           -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zwork2           -      -      -        -         -      -      -      -    ONLINE
logs                               -      -      -        -         -      -      -      -  -
  mirror-1                     16.5G     0K  16.5G        -         -     0%  0.00%      -    ONLINE
    gpt/builder01_zwork_zlog0      -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zwork_zlog1      -      -      -        -         -      -      -      -    ONLINE

I wanted to remove the mirrored slogs, resize the partitions, and re-add them.
Through an oversight they were originally almost 17 GiB, and I wanted them to be
16.0 GiB.
I don't know whether I should have run zpool labelclear on the slog partitions
before resizing; I didn't.

zpool remove builder01_zroot mirror-1
zpool remove builder01_zwork mirror-1

gpart resize -i 1 -s 16G xbd4
gpart resize -i 1 -s 16G xbd5
gpart resize -i 1 -s 16G xbd9
gpart resize -i 1 -s 16G xbd10

zpool add builder01_zroot log mirror gpt/builder01_zroot_zlog0 gpt/builder01_zroot_zlog1
zpool add builder01_zwork log mirror gpt/builder01_zwork_zlog0 gpt/builder01_zwork_zlog1

The listing looked very much like this:

NAME                            SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
builder01_zroot                 164G   117G  47.2G        -         -    14%    71%  1.00x    ONLINE  -
  raidz1-0                      164G   117G  47.2G        -         -    14%  71.2%      -    ONLINE
    gpt/builder01_zroot0           -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zroot1           -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zroot2           -      -      -        -         -      -      -      -    ONLINE
logs                               -      -      -        -         -      -      -      -  -
  mirror-2                     15.5G    68K  15.5G        -         -     0%  0.00%      -    ONLINE
    gpt/builder01_zroot_zlog0      -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zroot_zlog1      -      -      -        -         -      -      -      -    ONLINE
builder01_zwork                 374G   237G   137G        -         -    38%    63%  1.00x    ONLINE  -
  raidz1-0                      374G   237G   137G        -         -    38%  63.3%      -    ONLINE
    gpt/builder01_zwork0           -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zwork1           -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zwork2           -      -      -        -         -      -      -      -    ONLINE
logs                               -      -      -        -         -      -      -      -  -
  mirror-2                     15.5G     0K  15.5G        -         -     0%  0.00%      -    ONLINE
    gpt/builder01_zwork_zlog0      -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zwork_zlog1      -      -      -        -         -      -      -      -    ONLINE

I noticed that mirror-1 became mirror-2 in both pools; I expected the pairs to
still be named mirror-1.

Upon reboot, I got the panic below.

I booted from a DVD, removed the mirrored slogs from both pools, and the system
could again boot from the root pool.
I re-added the mirrored slogs to the work pool while the system was running.

The listing now looks like this:

NAME                            SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
builder01_zroot                 164G   117G  47.2G        -         -    14%    71%  1.00x    ONLINE  -
  raidz1-0                      164G   117G  47.2G        -         -    14%  71.2%      -    ONLINE
    gpt/builder01_zroot0           -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zroot1           -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zroot2           -      -      -        -         -      -      -      -    ONLINE
builder01_zwork                 374G   237G   137G        -         -    38%    63%  1.00x    ONLINE  -
  raidz1-0                      374G   237G   137G        -         -    38%  63.3%      -    ONLINE
    gpt/builder01_zwork0           -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zwork1           -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zwork2           -      -      -        -         -      -      -      -    ONLINE
logs                               -      -      -        -         -      -      -      -  -
  mirror-3                     15.5G     0K  15.5G        -         -     0%  0.00%      -    ONLINE
    gpt/builder01_zwork_zlog0      -      -      -        -         -      -      -      -    ONLINE
    gpt/builder01_zwork_zlog1      -      -      -        -         -      -      -      -    ONLINE

Maybe the recent OpenZFS commits fix this issue. If not, maybe the test suite
should be extended to cover the kernel's ability to mount a root pool whose
vdevs are numbered non-contiguously, if that is what triggers the panic.

Note: local branch commit 7806d3b0243f..., as indicated in the BE's name,
corresponds to src/stable/13 commit 3ea8c7ad90f75129c52a2b64213c5578af23dc8d,
dated Tue Aug 9 15:47:40 2022 -0400.

Here's the panic message, screenshotted, OCR-ed, and edited by hand:

Trying to mount root from
zfs:builder01_zroot/ROOT/20220810-190437-stable-13-local-n252030-7806d3b0243f
[] ...
cd0 at ata1 bus 0 scbus1 target 1 lun 0
cd0: <QEMU QEMU DVD-ROM 2.5+> Removable CD-ROM SCSI device
cd0: Serial Number QM00004
cd0: 16.700MB/s transfers (WDMA2, ATAPI 12bytes, PIO 65534bytes)
cd0: Attempt to query device size failed: NOT READY, Medium not present
panic: VERIFY3(0 == nvlist_lookup_uint64(nvl, name, &rv)) failed (0 == 22)

cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff805b804b =
db_trace_self_wrapper+0x2b/frame 0xfffffe009cdec580
vpanic() at 0xffffffff80806fb1 = vpanic+0x151/frame 0xfffffe009cdec5d8
spl_panic() at 0xffffffff8036391a = spl_panic+0x3a/frame 0xfffffe009cdec630
fnvlist_lookup_uint64() at 0xffffffff80385ef3 =
fnvlist_lookup_uint64+0x43/frame 0xfffffe009cdec650
spa_import_rootpool() at 0xffffffff8038d10e = spa_import_rootpool+0x5e/frame
0xfffffe009cdec6c0
zfs_mount() at 0xffffffff8039aaaf = zfs_mount+0x41f/frame 0xfffffe009cdec850
vfs_domount_first() at 0xffffffff808e1f03 = vfs_domount_first+0x213/frame
0xfffffe009cdec980
vfs_domount() at 0xffffffff808de855 = vfs_domount+0x2b5/frame
0xfffffe009cdecab0
vfs_donmount() at 0xffffffff808ddd85 = vfs_donmount+0x8d5/frame
0xfffffe009cdecb50
kernel_mount() at 0xffffffff808e100d = kernel_mount+0x3d/frame
0xfffffe009cdecba0
parse_mount() at 0xffffffff808e5364 = parse_mount+0x4d4/frame
0xfffffe009cdecce0
vfs_mountroot() at 0xffffffff808e37b3 = vfs_mountroot+0x763/frame
0xfffffe009cdece50
start_init() at 0xffffffff807932c3 = start_init+0x23/frame 0xfffffe009cdecef0
fork_exit() at 0xffffffff807c2a9e = fork_exit+0x7e/frame 0xfffffe009cdecf30
fork_trampoline() at 0xffffffff80baf89e = fork_trampoline+0xe/frame
0xfffffe009cdecf30
--- trap 0x9ce4aa98, rip = 0xffffffff8079288f, rsp = 0, rbp = 0x20014 ---
mi_startup() at 0xffffffff8079200f = mi_startup+0xdf/frame 0x20014
Uptime: 1s
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.

-- 
You are receiving this mail because:
You are the assignee for the bug.