Re: Fatal trap 12: .. cpu_idle_acpi .. callout_process

In reply to: Ryan Libby : "Re: Fatal trap 12: .. cpu_idle_acpi .. callout_process"
From: Bjoern A. Zeeb <bzeeb-lists_at_lists.zabbadoz.net>
Date: Sat, 06 Jun 2026 13:08:09 UTC
On Fri, 5 Jun 2026, Ryan Libby wrote:

> On Tue, Jun 2, 2026 at 9:25 PM Bjoern A. Zeeb
> <bzeeb-lists@lists.zabbadoz.net> wrote:
>>
>> On Wed, 27 May 2026, Bjoern A. Zeeb wrote:
>>
>>> On Tue, 26 May 2026, Bjoern A. Zeeb wrote:
>>>
>>>> Hi,
>>>>
>>>> I got some LinuxKPI problems sorted and can finally shut down a system
>>>> without a driver panicking, but now I see the panic below on a recent
>>>> main (PXE booted in bhyve).  It seems reproducible, and typing reset gets
>>>> me the next panic and the next and the next ... until bhyve stops after
>>>> scrolling for a few seconds.
>>>>
>>>> Anyone seen this or any ideas?  I'll try to build a plain main kernel
>>>> otherwise to check that it's not anything else...
>>>
>>> I have already found the next LinuxKPI bug.
>>>
>>> If I just boot a kernel and do a shutdown -r I do not run into it
>>> so unless it rings a bell for someone else as well, please ignore this for
>>> now.
>>
>> It just happened again;  no known LinuxKPI bugs in the way this time.
>>
>> So maybe it's real after all...
>>
>>
>>>> Syncing disks, vnodes remaining... 0 done
>>>> All buffers synced.
>>>> Uptime: 46s
>>>> kernel trap 12 with interrupts disabled
>>>>
>>>>
>>>> Fatal trap 12: page fault while in kernel mode
>>>> cpuid = 0; apic id = 00
>>>> fault virtual address   = 0xfffffe00a58a0630
>>>> fault code              = supervisor read data, page not present
>>>> instruction pointer     = 0x20:0xffffffff80c0ebe8
>>>> stack pointer           = 0x28:0xfffffe008bc49bb0
>>>> frame pointer           = 0x28:0xfffffe008bc49c20
>>>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>>>                        = DPL 0, pres 1, long 1, def32 0, gran 1
>>>> processor eflags        = resume, IOPL = 0
>>>> current process         = 11 (idle: cpu0)
>>>> rdi: 0000000000002f2c rsi: 0000000000008000 rdx: 0000000000002e2d
>>>> rcx: 0000000000002e2c  r8: fffffe00a58a0630  r9: 000000007fff2744
>>>> rax: fffffe000ef4e000 rbx: 0000000000002e2c rbp: fffffe008bc49c20
>>>> r10: 00000000000003e7 r11: 000000000000044c r12: 0000002f2d000000
>>>> r13: 0000002f2d000000 r14: 0000002e2dd1597a r15: ffffffff82b28300
>>>> trap number             = 12
>>>> panic: page fault
>>>> cpuid = 0
>>>> time = 1779819492
>>>> KDB: stack backtrace:
>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x36/frame
>>>> 0xfffffe008bc498e0
>>>> vpanic() at vpanic+0x149/frame 0xfffffe008bc49a10
>>>> panic() at panic+0x43/frame 0xfffffe008bc49a70
>>>> trap_pfault() at trap_pfault+0x449/frame 0xfffffe008bc49ae0
>>>> calltrap() at calltrap+0x8/frame 0xfffffe008bc49ae0
>>>> --- trap 0xc, rip = 0xffffffff80c0ebe8, rsp = 0xfffffe008bc49bb0, rbp =
>>>> 0xfffffe008bc49c20 ---
>>>> callout_process() at callout_process+0x138/frame 0xfffffe008bc49c20
>>>> handleevents() at handleevents+0x19a/frame 0xfffffe008bc49c60
>>>> timercb() at timercb+0x19e/frame 0xfffffe008bc49cc0
>>>> lapic_handle_timer() at lapic_handle_timer+0xa4/frame 0xfffffe008bc49cf0
>>>> Xtimerint() at Xtimerint+0xb1/frame 0xfffffe008bc49cf0
>>>> --- interrupt, rip = 0xffffffff810b1104, rsp = 0xfffffe008bc49dc0, rbp =
>>>> 0xfffffe008bc49dd0 ---
>>>> cpu_idle_acpi() at cpu_idle_acpi+0x54/frame 0xfffffe008bc49dd0
>>>> cpu_idle() at cpu_idle+0xa6/frame 0xfffffe008bc49df0
>>>> sched_ule_idletd() at sched_ule_idletd+0x524/frame 0xfffffe008bc49ef0
>>>> fork_exit() at fork_exit+0x82/frame 0xfffffe008bc49f30
>>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008bc49f30
>>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>>>> KDB: enter: panic
>>>> [ thread pid 11 tid 100003 ]
>>>> Stopped at      kdb_enter+0x33: movq    $0,0x15be0c2(%rip)
>>>> db> reset
>>>> panic: mtx_lock_spin: recursed on non-recursive mutex callout @
>>>> /usr/src/sys/kern/kern_timeout.c:576
>>>>
>>>> cpuid = 0
>>>> time = 1779819492
>>>> KDB: stack backtrace:
>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x36/frame
>>>> 0xfffffe008bc49160
>>>> vpanic() at vpanic+0x149/frame 0xfffffe008bc49290
>>>> panic() at panic+0x43/frame 0xfffffe008bc492f0
>>>> __mtx_lock_spin_flags() at __mtx_lock_spin_flags+0x11b/frame
>>>> 0xfffffe008bc49330
>>>> _callout_stop_safe() at _callout_stop_safe+0x106/frame 0xfffffe008bc493a0
>>>> shutdown_resettodr() at shutdown_resettodr+0x15/frame 0xfffffe008bc493b0
>>>> kern_reboot() at kern_reboot+0x2a3/frame 0xfffffe008bc493f0
>>>> db_reset() at db_reset+0x108/frame 0xfffffe008bc49420
>>>> db_command() at db_command+0x3aa/frame 0xfffffe008bc494e0
>>>> db_command_loop() at db_command_loop+0x4d/frame 0xfffffe008bc494f0
>>>> db_trap() at db_trap+0x100/frame 0xfffffe008bc49590
>>>> kdb_trap() at kdb_trap+0x25f/frame 0xfffffe008bc496e0
>>>> trap() at trap+0x888/frame 0xfffffe008bc49810
>>>> calltrap() at calltrap+0x8/frame 0xfffffe008bc49810
>>>> --- trap 0x3, rip = 0xffffffff80c44f43, rsp = 0xfffffe008bc498e8, rbp =
>>>> 0xfffffe008bc49a10 ---
>>>> kdb_enter() at kdb_enter+0x33/frame 0xfffffe008bc49a10
>>>> panic() at panic+0x43/frame 0xfffffe008bc49a70
>>>> trap_pfault() at trap_pfault+0x449/frame 0xfffffe008bc49ae0
>>>> calltrap() at calltrap+0x8/frame 0xfffffe008bc49ae0
>>>> --- trap 0xc, rip = 0xffffffff80c0ebe8, rsp = 0xfffffe008bc49bb0, rbp =
>>>> 0xfffffe008bc49c20 ---
>>>> callout_process() at callout_process+0x138/frame 0xfffffe008bc49c20
>>>> handleevents() at handleevents+0x19a/frame 0xfffffe008bc49c60
>>>> timercb() at timercb+0x19e/frame 0xfffffe008bc49cc0
>>>> lapic_handle_timer() at lapic_handle_timer+0xa4/frame 0xfffffe008bc49cf0
>>>> Xtimerint() at Xtimerint+0xb1/frame 0xfffffe008bc49cf0
>>>> --- interrupt, rip = 0xffffffff810b1104, rsp = 0xfffffe008bc49dc0, rbp =
>>>> 0xfffffe008bc49dd0 ---
>>>> cpu_idle_acpi() at cpu_idle_acpi+0x54/frame 0xfffffe008bc49dd0
>>>> cpu_idle() at cpu_idle+0xa6/frame 0xfffffe008bc49df0
>>>> sched_ule_idletd() at sched_ule_idletd+0x524/frame 0xfffffe008bc49ef0
>>>> fork_exit() at fork_exit+0x82/frame 0xfffffe008bc49f30
>>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008bc49f30
>>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>>>> panic: mtx_lock_spin: recursed on non-recursive mutex callout @
>>>> /usr/src/sys/kern/kern_timeout.c:576
>>>>
>>>> cpuid = 0
>>>> time = 1779819492
>>>> ..
>>>> ..
>>>> ..
>>>>
>>>>
>>>>
>>>
>>>
>>
>> --
>> Bjoern A. Zeeb                                                     r15:7
>>
>
> Can you resolve this?
>> callout_process() at callout_process+0x138
>
> Just guessing from my local kernel, that may be the first touch of a
> callout in the LIST_FOREACH_SAFE loop of callout_process.  If so that
> may suggest a use after free of some callout, with a dangling pointer
> to the callout remaining in the list.  Maybe someone freed some
> callout without stopping it.  Or maybe the list is corrupt in some
> other way.
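
To make the suspected failure mode concrete, here is a minimal userland
sketch of it.  Nothing in it is kernel code: toy_callout, toy_wheel,
toy_softc and all the functions are hypothetical stand-ins, and the list
is hand-rolled rather than the real callout wheel.  It only shows the
rule the guess above relies on: an armed callout must be unlinked before
its memory is freed, otherwise the next list walk touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a callout; NOT the kernel's struct callout. */
struct toy_callout {
	struct toy_callout *next;
	struct toy_callout **prevp;	/* address of the pointer pointing at us */
	bool pending;
	void (*func)(void *);
	void *arg;
};

/* Hypothetical stand-in for one callout bucket: a plain linked list. */
struct toy_wheel {
	struct toy_callout *head;
};

/* Unlink a pending callout; returns true if it was actually pending. */
bool
toy_callout_stop(struct toy_callout *c)
{
	if (!c->pending)
		return (false);
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
	c->pending = false;
	return (true);
}

/* Arm a callout by linking it at the head of the list. */
void
toy_callout_reset(struct toy_wheel *w, struct toy_callout *c,
    void (*func)(void *), void *arg)
{
	assert(func != NULL);
	toy_callout_stop(c);
	c->func = func;
	c->arg = arg;
	c->next = w->head;
	c->prevp = &w->head;
	if (w->head != NULL)
		w->head->prevp = &c->next;
	w->head = c;
	c->pending = true;
}

/*
 * Walk the list and fire every callout.  Reading c->next is the "first
 * touch" of each entry; if c had been freed while still linked, this is
 * where the walk would read freed memory.
 */
int
toy_callout_process(struct toy_wheel *w)
{
	struct toy_callout *c, *tmp;
	int fired = 0;

	for (c = w->head; c != NULL; c = tmp) {
		tmp = c->next;
		toy_callout_stop(c);
		c->func(c->arg);
		fired++;
	}
	return (fired);
}

/* Hypothetical driver state that embeds its callout. */
struct toy_softc {
	struct toy_callout tick;
	int ticks;
};

static void
toy_tick(void *arg)
{
	struct toy_softc *sc = arg;

	sc->ticks++;
}

struct toy_softc *
toy_attach(struct toy_wheel *w)
{
	struct toy_softc *sc = calloc(1, sizeof(*sc));

	if (sc != NULL)
		toy_callout_reset(w, &sc->tick, toy_tick, sc);
	return (sc);
}

void
toy_detach(struct toy_softc *sc)
{
	/* Unlink first; skipping this leaves a dangling list entry. */
	toy_callout_stop(&sc->tick);
	free(sc);
}
```

In the kernel the corresponding step would be stopping or draining the
callout (see callout(9)) before freeing whatever structure embeds it.
Whether that is what actually goes wrong here is still only a guess.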

I can try the next time I have a clean kernel to check against.  I've since
built multiple kernels, I am afraid.

Also, it seems I had seen this before and filed a bug that went unnoticed:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=291294

-- 
Bjoern A. Zeeb                                                     r15:7