[Bug 278414] Reproducible zpool(8) panic with 14.0-RELEASE amd64-zfs.raw VM-IMAGES

From: <bugzilla-noreply_at_freebsd.org>
Date: Wed, 17 Apr 2024 17:24:57 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=278414

            Bug ID: 278414
           Summary: Reproducible zpool(8) panic with 14.0-RELEASE
                    amd64-zfs.raw VM-IMAGES
           Product: Base System
           Version: 14.0-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: editor@callfortesting.org

Created attachment 250031
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=250031&action=edit
Script to reproduce the issues

I have been exercising the 14.0-RELEASE amd64-zfs.raw VM-IMAGES produced by
Release Engineering (thank you for these!) and have found two reproducible
issues when mirroring two images (thank you for fixing mkimg/makefs to allow
this!):

Some runs of the attached reproduction script complete flawlessly, while others
report between 4 and roughly 50K checksum errors on the attached device:

        NAME             STATE     READ WRITE CKSUM
        zroot            ONLINE       0     0     0
          mirror-0       ONLINE       0     0     0
            gpt/rootfs1  ONLINE       0     0     0
            gpt/rootfs2  ONLINE       0     0 51.4K
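
Because the checksum errors appear only on some runs, it can help to check for
them mechanically after each run. The following is a minimal sketch, not part
of the attached script: the function name cksum_errors is hypothetical, and it
assumes the five-column config layout shown in the status table above.

```shell
#!/bin/sh
# Sketch: list vdevs with a non-zero CKSUM count in `zpool status` output.
# cksum_errors is a hypothetical helper, not part of the attached script.
cksum_errors() {
    # Config rows have five fields: NAME STATE READ WRITE CKSUM.
    # The regex accepts plain counts ("4") and abbreviated ones ("51.4K"),
    # and rejects the header row and the trailing "errors:" summary line.
    awk 'NF == 5 && $5 ~ /^[0-9.]+[KMGTPE]?$/ && $5 != "0" { print $1, $5 }'
}

# Demo on the status table quoted above. On a live system one would run:
#   zpool status zroot | cksum_errors
cksum_errors <<'EOF'
        NAME             STATE     READ WRITE CKSUM
        zroot            ONLINE       0     0     0
          mirror-0       ONLINE       0     0     0
            gpt/rootfs1  ONLINE       0     0     0
            gpt/rootfs2  ONLINE       0     0 51.4K
EOF
```

Rows that carry a trailing note after the CKSUM column have more than five
fields and are skipped by this sketch.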

Some runs cause a panic:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x10
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff81f54447
stack pointer           = 0x28:0xfffffe016703cce8
frame pointer           = 0x28:0xfffffe016703cd20
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 6 (dmu_objset_find_2)
rdi: fffff8045d553ac8 rsi: fffff80449059b50 rdx: fffff804490598d0
rcx: fffff80449059b50  r8: 0000000000000001  r9: 0000000000000002
rax: 0000000000000001 rbx: 00000000ffffffff rbp: fffffe016703cd20
r10: 0000000000000000 r11: 0000000000000001 r12: 0000000000000003
r13: 0000000000000001 r14: 0000000000000000 r15: 0000000000000046

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x12
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff84161a20
stack pointer           = 0x28:0xfffffe016703c440
frame pointer           = 0x28:0xfffffe016703c480
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 6 (dmu_objset_find_2)
rdi: 0000000000000012 rsi: fffff80102350828 rdx: fffff80102350810
rcx: fffffe01652553f8  r8: ffffffffffffffda  r9: 0000000000000000
rax: fffffe016703c818 rbx: fffffe01652582b0 rbp: fffffe016703c480
r10: fffffe0165e16a72 r11: fffff80024eac800 r12: fffff8017c2f1900
r13: fffffe0165255000 r14: fffff80024eac800 r15: 0000000000000000

KDB: stack backtrace:
#0 0xffffffff80b9009d at kdb_backtrace+0x5d
#1 0xffffffff80b431a2 at vpanic+0x132
#2 0xffffffff80b43063 at panic+0x43
#3 0xffffffff8100c85c at trap_fatal+0x40c
#4 0xffffffff8100c8af at trap_pfault+0x4f
#5 0xffffffff80fe3ad8 at calltrap+0x8
#6 0xffffffff8411829a at skl_compute_wm+0xa6a
#7 0xffffffff840df49f at intel_atomic_check+0xf0f
#8 0xffffffff83d15783 at drm_atomic_check_only+0x4a3
#9 0xffffffff83d15bc3 at drm_atomic_commit+0x13
#10 0xffffffff83d252c8 at drm_client_modeset_commit_atomic+0x158
#11 0xffffffff83d253b4 at drm_client_modeset_commit_locked+0x74
#12 0xffffffff83d25541 at drm_client_modeset_commit+0x21
#13 0xffffffff83d68303 at drm_fb_helper_restore_fbdev_mode_unlocked+0x83
#14 0xffffffff83d55661 at vt_kms_postswitch+0x181
#15 0xffffffff8098a01f at vt_window_switch+0x11f
#16 0xffffffff8098b45f at vtterm_cngrab+0x4f
#17 0xffffffff80ad7556 at cngrab+0x26

I am attaching the text dump and can provide a core dump, but hopefully the
reproduction script will let you reproduce the panic yourself.

The script does not download the amd64 VM-IMAGE for you. Fetch the image
separately and expand it with unxz.

Caveat: The VM-IMAGES use the zpool name 'zroot', which will conflict with a
host whose pool has the same name. I can add rename-on-import syntax if you
like.
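
For reference, a rename on import could look roughly like the sketch below. It
only prints the command so it can be reviewed before being run. The new pool
name 'zroot_vm', the altroot '/mnt/vm', and the helper name rename_import_cmd
are hypothetical and are not taken from the attached script.

```shell
#!/bin/sh
# Sketch: build a `zpool import` command that imports the image's pool
# under a different name, so it does not collide with a host 'zroot'.
# All names below are illustrative assumptions.
rename_import_cmd() {
    old_name=$1   # pool name stored in the image
    new_name=$2   # name to use on the host
    altroot=$3    # alternate root, so mountpoints do not shadow the host's
    printf 'zpool import -R %s %s %s\n' "$altroot" "$old_name" "$new_name"
}

# Print the command for review rather than executing it.
rename_import_cmd zroot zroot_vm /mnt/vm
```

If more than one importable pool named 'zroot' is visible, the numeric pool ID
listed by a bare `zpool import` may be needed in place of the old name.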

Let me know what other information might be helpful. Thanks!
