[Bug 295609] GEOM, maybe g_cache, panic on 16-CURRENT during heavy load

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 26 May 2026 13:20:54 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=295609

            Bug ID: 295609
           Summary: GEOM, maybe g_cache, panic on 16-CURRENT during heavy
                    load
           Product: Base System
           Version: 16.0-CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: agh@riseup.net

I have been getting kernel panics while poudriere-bulk is running. The host is
an AMD EPYC 7742 (64 cores) with 125GiB RAM. Poudriere is configured with
USE_TMPFS=all. The disk storage is ufs+gjournal+geli on a gstripe of 8
gmirror pairs, for a total of 16 disks and 24TiB of space; each disk is
labeled with gcache. Each gcache label was configured with 128 elements at
1MiB. I have noticed recently that after some time (anywhere from minutes to
hours) the host responds poorly to input within a tmux session, and the NFS
clients often stall for minutes reading or writing files. The same file I/O
delay is also observed on the host itself.
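
For scale, here is a rough sketch of how much memory the gcache layer could
hold at most. It assumes "128 elements at 1MiB" means a cache size of 128
blocks with a 1MiB block size on each of the 16 labels; the function name is
only illustrative, and per-entry metadata is not counted.

```python
MIB = 1024 * 1024

def gcache_upper_bound_bytes(disks: int, blocks_per_label: int, block_size: int) -> int:
    """Upper bound on cached data, assuming every label fills its cache."""
    return disks * blocks_per_label * block_size

# Assumed reading of the setup: 16 labels, 128 blocks each, 1MiB per block.
total = gcache_upper_bound_bytes(disks=16, blocks_per_label=128, block_size=MIB)
print(total // MIB, "MiB")
```

Under that reading the caches top out at about 2GiB combined, which is small
next to the 125GiB of RAM.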

> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 11; apic id = 0b
> instruction pointer     = 0x20:0xffffffff806187d0
> stack pointer           = 0:0xfffffe01fb9b7e20
> frame pointer           = 0:0xfffffe01fb9b7e60
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 14 (g_down)
> rdi: fffff8011ba60180 rsi: fffff80e0f318cc0 rdx: 6972506e6f697461
> rcx: 0000000000000007  r8: 0000000000000000  r9: fffff8011ba1bc00
> rax: 00000000002d5cd7 rbx: fffff80e0f318cc0 rbp: fffffe01fb9b7e60
> r10: 0000000000009000 r11: 0000000000000200 r12: fffff80e0f318cc0
> r13: fffff8010e9b3780 r14: fffff8011ba601b8 r15: fffff8011ba60180
> trap number             = 9
> panic: general protection fault
> cpuid = 11
> time = 1779798903
> Uptime: 3h38m43s
> Dumping 9081 out of 130902 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> 
> Reading symbols from /boot/kernel.FAFNIR/linux.ko...
> Reading symbols from /usr/lib/debug//boot/kernel.FAFNIR/linux.ko.debug...
> Reading symbols from /boot/kernel.FAFNIR/linux_common.ko...
> Reading symbols from /usr/lib/debug//boot/kernel.FAFNIR/linux_common.ko.debug...
> Reading symbols from /boot/kernel.FAFNIR/vmm.ko...
> Reading symbols from /usr/lib/debug//boot/kernel.FAFNIR/vmm.ko.debug...
> Reading symbols from /boot/kernel.FAFNIR/geom_journal.ko...
> Reading symbols from /usr/lib/debug//boot/kernel.FAFNIR/geom_journal.ko.debug...
> Reading symbols from /boot/kernel.FAFNIR/linux64.ko...
> Reading symbols from /usr/lib/debug//boot/kernel.FAFNIR/linux64.ko.debug...
> Reading symbols from /boot/kernel.FAFNIR/linprocfs.ko...
> Reading symbols from /usr/lib/debug//boot/kernel.FAFNIR/linprocfs.ko.debug...
> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
> warning: 57     /usr/src/sys/amd64/include/pcpu_aux.h: No such file or directory
> (kgdb) where
> #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
> #1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
> #2  0xffffffff806dbdb4 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:519
> #3  0xffffffff806dc275 in vpanic (fmt=0xffffffff80bca28a "%s", ap=ap@entry=0xfffffe01fb9b7be0) at /usr/src/sys/kern/kern_shutdown.c:974
> #4  0xffffffff806dc113 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:887
> #5  0xffffffff80aa7638 in trap_fatal (frame=<optimized out>, eva=<optimized out>) at /usr/src/sys/amd64/amd64/trap.c:1028
> #6  0xffffffff80aa7527 in trap (frame=0xfffffe01fb9b7d60) at /usr/src/sys/amd64/amd64/trap.c:684
> #7  <signal handler called>
> #8  g_cache_lookup (sc=0xfffff8011ba60180, bno=2972887) at /usr/src/sys/geom/cache/g_cache.c:254
> #9  g_cache_read (sc=sc@entry=0xfffff8011ba60180, bp=bp@entry=0xfffff80e0f318cc0) at /usr/src/sys/geom/cache/g_cache.c:266
> #10 0xffffffff80618046 in g_cache_start (bp=0xfffff80e0f318cc0) at /usr/src/sys/geom/cache/g_cache.c:363
> #11 0xffffffff80630507 in g_io_schedule_down (tp=<optimized out>) at /usr/src/sys/geom/geom_io.c:849
> #12 0xffffffff80630edc in g_down_procbody (arg=<optimized out>) at /usr/src/sys/geom/geom_kern.c:108
> #13 0xffffffff80697663 in fork_exit (callout=0xffffffff80630e80 <g_down_procbody>, arg=0x0, frame=0xfffffe01fb9b7f40) at /usr/src/sys/kern/kern_fork.c:1201
> #14 <signal handler called>
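
One observation from the register dump, offered as a sketch and not as a
diagnosis: the value in rdx does not look like a kernel pointer. The snippet
below reinterprets it as 8 little-endian bytes and checks whether it is a
canonical amd64 address, assuming 48-bit virtual addresses. The helper names
are only illustrative. Whether rdx is the pointer actually dereferenced at
g_cache.c:254 is an assumption; disassembling the faulting instruction would
confirm it.

```python
import struct

def reg_as_ascii(value: int) -> str:
    """Render a 64-bit register value as 8 little-endian bytes of text."""
    raw = struct.pack("<Q", value)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

def is_canonical(addr: int) -> bool:
    """True if bits 63..47 are all equal (48-bit amd64 canonical form)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF

rdx = 0x6972506E6F697461  # from the trap frame above
rdi = 0xFFFFF8011BA60180  # sc, for comparison

print(reg_as_ascii(rdx))                     # printable text, not an address
print(is_canonical(rdx), is_canonical(rdi))  # rdx is non-canonical, rdi is fine
print(hex(2972887))                          # bno from frame #8, compare with rax
```

The bytes decode to "ationPri", and the value is non-canonical, which on amd64
raises a general protection fault instead of a page fault when dereferenced.
That would be consistent with a cache entry or hash chain pointer having been
overwritten with string data, for example after a use-after-free. The bno
argument in frame #8 is 0x2d5cd7, the same value as rax.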
