Re: Fatal trap 12: .. cpu_idle_acpi .. callout_process
- In reply to: Ryan Libby : "Re: Fatal trap 12: .. cpu_idle_acpi .. callout_process"
Date: Sat, 06 Jun 2026 13:08:09 UTC
On Fri, 5 Jun 2026, Ryan Libby wrote:

> On Tue, Jun 2, 2026 at 9:25 PM Bjoern A. Zeeb
> <bzeeb-lists@lists.zabbadoz.net> wrote:
>>
>> On Wed, 27 May 2026, Bjoern A. Zeeb wrote:
>>
>>> On Tue, 26 May 2026, Bjoern A. Zeeb wrote:
>>>
>>>> Hi,
>>>>
>>>> I got some LinuxKPI problems sorted and can finally shutdown a system w/o
>>>> a driver panicing but now I see on a recent main (pxe booted in bhyve);
>>>> this seems reproducible and typing reset I get the next panic and the next
>>>> and the next and ... until bhyve stops after scrolling for a few seconds.
>>>>
>>>> Anyone seen this or any ideas? I'll try to build a plain main kernel
>>>> otherwise
>>>> to check that it's not anything else...
>>>
>>> I have already found the next LinuxKPI bug.
>>>
>>> If I just boot a kernel and do a shutdown -r I do not run into it
>>> so unless it rings a bell for someone else as well, please ignore this for
>>> now.
>>
>> It just happened again; no known LinuxKPI bugs in the way this time.
>>
>> So maybe it's real after all...
>>
>>
>>>> Syncing disks, vnodes remaining... 0 done
>>>> All buffers synced.
>>>> Uptime: 46s
>>>> kernel trap 12 with interrupts disabled
>>>>
>>>>
>>>> Fatal trap 12: page fault while in kernel mode
>>>> cpuid = 0; apic id = 00
>>>> fault virtual address = 0xfffffe00a58a0630
>>>> fault code = supervisor read data, page not present
>>>> instruction pointer = 0x20:0xffffffff80c0ebe8
>>>> stack pointer = 0x28:0xfffffe008bc49bb0
>>>> frame pointer = 0x28:0xfffffe008bc49c20
>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>>> processor eflags = resume, IOPL = 0
>>>> current process = 11 (idle: cpu0)
>>>> rdi: 0000000000002f2c rsi: 0000000000008000 rdx: 0000000000002e2d
>>>> rcx: 0000000000002e2c r8: fffffe00a58a0630 r9: 000000007fff2744
>>>> rax: fffffe000ef4e000 rbx: 0000000000002e2c rbp: fffffe008bc49c20
>>>> r10: 00000000000003e7 r11: 000000000000044c r12: 0000002f2d000000
>>>> r13: 0000002f2d000000 r14: 0000002e2dd1597a r15: ffffffff82b28300
>>>> trap number = 12
>>>> panic: page fault
>>>> cpuid = 0
>>>> time = 1779819492
>>>> KDB: stack backtrace:
>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x36/frame
>>>> 0xfffffe008bc498e0
>>>> vpanic() at vpanic+0x149/frame 0xfffffe008bc49a10
>>>> panic() at panic+0x43/frame 0xfffffe008bc49a70
>>>> trap_pfault() at trap_pfault+0x449/frame 0xfffffe008bc49ae0
>>>> calltrap() at calltrap+0x8/frame 0xfffffe008bc49ae0
>>>> --- trap 0xc, rip = 0xffffffff80c0ebe8, rsp = 0xfffffe008bc49bb0, rbp =
>>>> 0xfffffe008bc49c20 ---
>>>> callout_process() at callout_process+0x138/frame 0xfffffe008bc49c20
>>>> handleevents() at handleevents+0x19a/frame 0xfffffe008bc49c60
>>>> timercb() at timercb+0x19e/frame 0xfffffe008bc49cc0
>>>> lapic_handle_timer() at lapic_handle_timer+0xa4/frame 0xfffffe008bc49cf0
>>>> Xtimerint() at Xtimerint+0xb1/frame 0xfffffe008bc49cf0
>>>> --- interrupt, rip = 0xffffffff810b1104, rsp = 0xfffffe008bc49dc0, rbp =
>>>> 0xfffffe008bc49dd0 ---
>>>> cpu_idle_acpi() at cpu_idle_acpi+0x54/frame 0xfffffe008bc49dd0
>>>> cpu_idle() at cpu_idle+0xa6/frame 0xfffffe008bc49df0
>>>> sched_ule_idletd() at sched_ule_idletd+0x524/frame 0xfffffe008bc49ef0
>>>> fork_exit() at fork_exit+0x82/frame 0xfffffe008bc49f30
>>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008bc49f30
>>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>>>> KDB: enter: panic
>>>> [ thread pid 11 tid 100003 ]
>>>> Stopped at kdb_enter+0x33: movq $0,0x15be0c2(%rip)
>>>> db> reset
>>>> panic: mtx_lock_spin: recursed on non-recursive mutex callout @
>>>> /usr/src/sys/kern/kern_timeout.c:576
>>>>
>>>> cpuid = 0
>>>> time = 1779819492
>>>> KDB: stack backtrace:
>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x36/frame
>>>> 0xfffffe008bc49160
>>>> vpanic() at vpanic+0x149/frame 0xfffffe008bc49290
>>>> panic() at panic+0x43/frame 0xfffffe008bc492f0
>>>> __mtx_lock_spin_flags() at __mtx_lock_spin_flags+0x11b/frame
>>>> 0xfffffe008bc49330
>>>> _callout_stop_safe() at _callout_stop_safe+0x106/frame 0xfffffe008bc493a0
>>>> shutdown_resettodr() at shutdown_resettodr+0x15/frame 0xfffffe008bc493b0
>>>> kern_reboot() at kern_reboot+0x2a3/frame 0xfffffe008bc493f0
>>>> db_reset() at db_reset+0x108/frame 0xfffffe008bc49420
>>>> db_command() at db_command+0x3aa/frame 0xfffffe008bc494e0
>>>> db_command_loop() at db_command_loop+0x4d/frame 0xfffffe008bc494f0
>>>> db_trap() at db_trap+0x100/frame 0xfffffe008bc49590
>>>> kdb_trap() at kdb_trap+0x25f/frame 0xfffffe008bc496e0
>>>> trap() at trap+0x888/frame 0xfffffe008bc49810
>>>> calltrap() at calltrap+0x8/frame 0xfffffe008bc49810
>>>> --- trap 0x3, rip = 0xffffffff80c44f43, rsp = 0xfffffe008bc498e8, rbp =
>>>> 0xfffffe008bc49a10 ---
>>>> kdb_enter() at kdb_enter+0x33/frame 0xfffffe008bc49a10
>>>> panic() at panic+0x43/frame 0xfffffe008bc49a70
>>>> trap_pfault() at trap_pfault+0x449/frame 0xfffffe008bc49ae0
>>>> calltrap() at calltrap+0x8/frame 0xfffffe008bc49ae0
>>>> --- trap 0xc, rip = 0xffffffff80c0ebe8, rsp = 0xfffffe008bc49bb0, rbp =
>>>> 0xfffffe008bc49c20 ---
>>>> callout_process() at callout_process+0x138/frame 0xfffffe008bc49c20
>>>> handleevents() at handleevents+0x19a/frame 0xfffffe008bc49c60
>>>> timercb() at timercb+0x19e/frame 0xfffffe008bc49cc0
>>>> lapic_handle_timer() at lapic_handle_timer+0xa4/frame 0xfffffe008bc49cf0
>>>> Xtimerint() at Xtimerint+0xb1/frame 0xfffffe008bc49cf0
>>>> --- interrupt, rip = 0xffffffff810b1104, rsp = 0xfffffe008bc49dc0, rbp =
>>>> 0xfffffe008bc49dd0 ---
>>>> cpu_idle_acpi() at cpu_idle_acpi+0x54/frame 0xfffffe008bc49dd0
>>>> cpu_idle() at cpu_idle+0xa6/frame 0xfffffe008bc49df0
>>>> sched_ule_idletd() at sched_ule_idletd+0x524/frame 0xfffffe008bc49ef0
>>>> fork_exit() at fork_exit+0x82/frame 0xfffffe008bc49f30
>>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008bc49f30
>>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>>>> panic: mtx_lock_spin: recursed on non-recursive mutex callout @
>>>> /usr/src/sys/kern/kern_timeout.c:576
>>>>
>>>> cpuid = 0
>>>> time = 1779819492
>>>> ..
>>>> ..
>>>> ..
>>>>
>>>>
>>>>
>>>
>>>
>>
>> --
>> Bjoern A. Zeeb r15:7
>>
>
> Can you resolve this?
>> callout_process() at callout_process+0x138
>
> Just guessing from my local kernel, that may be the first touch of a
> callout in the LIST_FOREACH_SAFE loop of callout_process. If so that
> may suggest a use after free of some callout, with a dangling pointer
> to the callout remaining in the list. Maybe someone freed some
> callout without stopping it. Or maybe the list is corrupt in some
> other way.

I can try the next time I have a clean kernel to check against.
I've since built multiple, I am afraid.

Also it seems I had seen this before and filed an unnoticed bug:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=291294

--
Bjoern A. Zeeb r15:7