Re: Panic: Page Fault in Kernel: Yesterday's CURRENT

From: Michael Butler via freebsd-current <freebsd-current_at_freebsd.org>
Date: Tue, 21 Dec 2021 21:04:49 UTC

Confirmed. The kernel at ..

FreeBSD 14.0-CURRENT #0 f06f1d1fdb9: Mon Dec 20 12:24:51 EST 2021

  .. boots successfully.

The kernel at ..

FreeBSD 14.0-CURRENT #1 553af8f1ec7: Tue Dec 21 15:16:10 EST 2021

  .. fails immediately after printing something like ..

Timecounters tick every 1.000 msec
Timecounter "TSC" frequency 701570048 Hz quality 800

  .. but before initializing ipfw as it used to,

	Michael
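
For anyone who can reach a console on similar hardware, the good/bad pair above can be handed to git bisect. This is only a rough sketch: it assumes the source tree is a git checkout at /usr/src and that f06f1d1fdb9 is an ancestor of 553af8f1ec7, neither of which is stated in this thread. The variable names and the start_bisect helper are made up for illustration.

```shell
#!/bin/sh
# Sketch only: SRC, GOOD, BAD and start_bisect are hypothetical names.
# GOOD boots and BAD fails after the timecounter lines, per the message above.
GOOD=${GOOD:-f06f1d1fdb9}
BAD=${BAD:-553af8f1ec7}
SRC=${SRC:-/usr/src}    # assumed location of the git checkout

start_bisect() {
    # "git bisect start" takes the bad revision first, then the good one.
    # git then checks out a revision between the two for testing.
    git -C "$SRC" bisect start "$BAD" "$GOOD"
}
```

After calling start_bisect, build and boot the kernel from the revision git checked out, then mark the result with "git -C /usr/src bisect good" or "git -C /usr/src bisect bad" and repeat until git names the first bad commit. "git -C /usr/src bisect reset" returns the tree to where it started.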

On 12/21/21 12:01, Michael Butler via freebsd-current wrote:
> I have an old pentium-3 that also won't boot kernels built after Dec 6th.
> 
> I suspect the commits listed below but, with the device being remote and 
> having no DRAC, I'm struggling to test this theory.
> 
> The relevant commits ..
> 
> commit 553af8f1ec71d397b5b4fd5876622b9269936e63
> Author: Mark Johnston <markj@FreeBSD.org>
> Date:   Mon Dec 6 10:42:19 2021 -0500
> 
>      x86: Perform late TSC calibration before LAPIC timer calibration
> 
> commit 62d09b46ad7508ae74d462e49234f0a80f91ff69
> Author: Mark Johnston <markj@FreeBSD.org>
> Date:   Mon Dec 6 10:42:10 2021 -0500
> 
>      x86: Defer LAPIC calibration until after timecounters are available
> 
> It's currently running git rev e43d081f352 and I have a kernel at git 
> rev f06f1d1fdb969fa7a0a6eefa030d8536f365eb6e to test later this evening,
> 
>      Michael
> 
> 
> On 12/17/21 15:07, Larry Rosenman wrote:
>> On 12/17/2021 1:36 pm, Mark Johnston wrote:
>>> On Fri, Dec 10, 2021 at 10:43:19AM -0600, Larry Rosenman wrote:
>>>> 14-2021_12_07-1217             -      -          1.87G 2021-12-07 12:17
>>>> 14-2021_12_09-1957             NR     /          121G  2021-12-09 19:57
>>>>
>>>> If that's any help
>>>
>>> I can't tell what this is saying.  A kernel built on the 7th does not
>>> crash, or...?  Which revision did you update from before you started
>>> seeing crashes?
>>>
>>> From a kgdb session it'd be useful to see output from
>>>
>>> (kgdb) frame 8
>>> (kgdb) p/x *tmp
>>>
>>> to start.
>>>
>>
>> Correct, the 7th didn't panic, but the 9th did, and yesterday's too.
>>
>> Grrr
>> ler in borg in /mnt🔒 on ☁️  (us-east-1)
>> ❯ kgdb -c /var/crash/vmcore.0  /mnt/boot/kernel/kernel
>> GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
>> Copyright (C) 2021 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.
>> Type "show copying" and "show warranty" for details.
>> This GDB was configured as "x86_64-portbld-freebsd14.0".
>> Type "show configuration" for configuration details.
>> For bug reporting instructions, please see:
>> <https://www.gnu.org/software/gdb/bugs/>.
>> Find the GDB manual and other documentation resources online at:
>>      <http://www.gnu.org/software/gdb/documentation/>.
>>
>> For help, type "help".
>> Type "apropos word" to search for commands related to "word"...
>> Reading symbols from /mnt/boot/kernel/kernel...
>> (No debugging symbols found in /mnt/boot/kernel/kernel)
>> Failed to open vmcore: /var/crash/vmcore.0: Permission denied
>> (kgdb) bt
>> No stack.
>> (kgdb) quit
>>
>> ler in borg in /mnt🔒 on ☁️  (us-east-1) took 6s
>> ❯ sudo chmod +r /var/crash/*
>>
>> ler in borg in /mnt🔒 on ☁️  (us-east-1)
>> ❯ kgdb -c /var/crash/vmcore.0  /mnt/boot/kernel/kernel
>> GNU gdb (GDB) 11.1 [GDB v11.1 for FreeBSD]
>> Copyright (C) 2021 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.
>> Type "show copying" and "show warranty" for details.
>> This GDB was configured as "x86_64-portbld-freebsd14.0".
>> Type "show configuration" for configuration details.
>> For bug reporting instructions, please see:
>> <https://www.gnu.org/software/gdb/bugs/>.
>> Find the GDB manual and other documentation resources online at:
>>      <http://www.gnu.org/software/gdb/documentation/>.
>>
>> For help, type "help".
>> Type "apropos word" to search for commands related to "word"...
>> Reading symbols from /mnt/boot/kernel/kernel...
>> (No debugging symbols found in /mnt/boot/kernel/kernel)
>> /wrkdirs/usr/ports/devel/gdb/work-py37/gdb-11.1/gdb/thread.c:1345: internal-error: void switch_to_thread(thread_info *): Assertion `thr != NULL' failed.
>> A problem internal to GDB has been detected,
>> further debugging may prove unreliable.
>> Quit this debugging session? (y or n) n
>>
>> This is a bug, please report it.  For instructions, see:
>> <https://www.gnu.org/software/gdb/bugs/>.
>>
>> /wrkdirs/usr/ports/devel/gdb/work-py37/gdb-11.1/gdb/thread.c:1345: internal-error: void switch_to_thread(thread_info *): Assertion `thr != NULL' failed.
>> A problem internal to GDB has been detected,
>> further debugging may prove unreliable.
>> Create a core file of GDB? (y or n) n
>> Command aborted.
>> (kgdb) bt
>> No thread selected.
>> (kgdb) fr 8
>> No thread selected.
>> (kgdb)
>>
>>>> On 12/10/2021 10:36 am, Alexander Motin wrote:
>>>> > Hi Larry,
>>>> >
>>>> > This looks like some use-after-free or otherwise corrupted callout
>>>> > structure.  Unfortunately the backtrace does not tell what was the
>>>> > callout.  When was the previous update to look what could change?
>>>> >
>>>> > On 10.12.2021 11:24, Larry Rosenman wrote:
>>>> >> FreeBSD borg.lerctr.org 14.0-CURRENT FreeBSD 14.0-CURRENT #15
>>>> >> main-n251537-ab639f2398b: Thu Dec  9 19:45:37 CST 2021
>>>> >> root@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/LER-MINIMAL
>>>> >> amd64
>>>> >>
>>>> >> VMCORE *IS* available.
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> Unread portion of the kernel message buffer:
>>>> >> kernel trap 12 with interrupts disabled
>>>> >>
>>>> >>
>>>> >> Fatal trap 12: page fault while in kernel mode
>>>> >> cpuid = 0; apic id = 20
>>>> >> fault virtual address   = 0x0
>>>> >> fault code              = supervisor write data, page not present
>>>> >> instruction pointer     = 0x20:0xffffffff804e0db4
>>>> >> stack pointer           = 0x0:0xfffffe0434de4e10
>>>> >> frame pointer           = 0x0:0xfffffe0434de4e70
>>>> >> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>>> >>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>>>> >> processor eflags        = resume, IOPL = 0
>>>> >> current process         = 82990 (c++)
>>>> >> trap number             = 12
>>>> >> panic: page fault
>>>> >> cpuid = 0
>>>> >> time = 1639111198
>>>> >> KDB: stack backtrace:
>>>> >> #0 0xffffffff8050fc95 at kdb_backtrace+0x65
>>>> >> #1 0xffffffff804c468f at vpanic+0x17f
>>>> >> #2 0xffffffff804c4503 at panic+0x43
>>>> >> #3 0xffffffff807a2195 at trap_fatal+0x385
>>>> >> #4 0xffffffff807a21ef at trap_pfault+0x4f
>>>> >> #5 0xffffffff80779c78 at calltrap+0x8
>>>> >> #6 0xffffffff8045ddb8 at handleevents+0x188
>>>> >> #7 0xffffffff8045ea3e at timercb+0x24e
>>>> >> #8 0xffffffff807ca9eb at lapic_handle_timer+0x9b
>>>> >> #9 0xffffffff8077b9b1 at Xtimerint+0xb1
>>>> >> Uptime: 2h28m57s
>>>> >> Dumping 12829 out of 131023
>>>> >> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>>>> >>
>>>> >> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
>>>> >> 55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
>>>> >> (offsetof(struct pcpu,
>>>> >> (kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
>>>> >> #1  doadump (textdump=<optimized out>)
>>>> >>     at /usr/src/sys/kern/kern_shutdown.c:399
>>>> >> #2  0xffffffff804c428c in kern_reboot (howto=260)
>>>> >>     at /usr/src/sys/kern/kern_shutdown.c:487
>>>> >> #3  0xffffffff804c46fe in vpanic (fmt=0xffffffff807e1276 "%s",
>>>> >>     ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:920
>>>> >> #4  0xffffffff804c4503 in panic (fmt=<unavailable>)
>>>> >>     at /usr/src/sys/kern/kern_shutdown.c:844
>>>> >> #5  0xffffffff807a2195 in trap_fatal (frame=0xfffffe0434de4d50, eva=0)
>>>> >>     at /usr/src/sys/amd64/amd64/trap.c:946
>>>> >> #6  0xffffffff807a21ef in trap_pfault (frame=0xfffffe0434de4d50,
>>>> >>     usermode=false, signo=<optimized out>, ucode=<optimized out>)
>>>> >>     at /usr/src/sys/amd64/amd64/trap.c:765
>>>> >> #7  <signal handler called>
>>>> >> #8  0xffffffff804e0db4 in callout_process (now=now@entry=38385536922300)
>>>> >>     at /usr/src/sys/kern/kern_timeout.c:488
>>>> >> #9  0xffffffff8045ddb8 in handleevents (now=now@entry=38385536922300,
>>>> >>     fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:213
>>>> >> #10 0xffffffff8045ea3e in timercb (et=0xffffffff80d475e0 <lapic_et>,
>>>> >>     arg=<optimized out>) at /usr/src/sys/kern/kern_clocksource.c:357
>>>> >> #11 0xffffffff807ca9eb in lapic_handle_timer (frame=0xfffffe0434de4f40)
>>>> >>     at /usr/src/sys/x86/x86/local_apic.c:1364
>>>> >> #12 <signal handler called>
>>>> >> #13 0x000000080df42bb6 in ?? ()
>>>> >> Backtrace stopped: Cannot access memory at address 0x7ffffdef2c90
>>>> >> (kgdb)
>>
> 
>