[Bug 253272] Page fault in _mca_init during boot

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Fri Feb 5 16:23:11 UTC 2021


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253272

            Bug ID: 253272
           Summary: Page fault in _mca_init during boot
           Product: Base System
           Version: 12.2-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs at FreeBSD.org
          Reporter: asomers at FreeBSD.org

I saw the following panic during boot on a system running something close to
12.2-RELEASE. It doesn't happen on every boot. However, I suspect I've hit the
same bug a few other times without noticing, because the kernel normally
reboots immediately, since swap is not yet configured at this point in boot.

Fatal trap 12: page fault while in kernel mode
cpuid = 26; apic id = 34
fault virtual address = 0xd0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8125a009
stack pointer = 0x28:0xfffffe0000b65f20
frame pointer = 0x28:0xfffffe0000b65f50
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 11 (idle: cpu26)
trap number = 12
panic: page fault
cpuid = 26
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0000b65be0
vpanic() at vpanic+0x17b/frame 0xfffffe0000b65c30
panic() at panic+0x43/frame 0xfffffe0000b65c90
trap_fatal() at trap_fatal+0x391/frame 0xfffffe0000b65cf0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0000b65d40
trap() at trap+0x286/frame 0xfffffe0000b65e50
calltrap() at calltrap+0x8/frame 0xfffffe0000b65e50
--- trap 0xc, rip = 0xffffffff8125a009, rsp = 0xfffffe0000b65f20, rbp = 0xfffffe0000b65f50 ---
_mca_init() at _mca_init+0x5d9/frame 0xfffffe0000b65f50
init_secondary_tail() at init_secondary_tail+0xfd/frame 0xfffffe0000b65f80
init_secondary() at init_secondary+0x2d1/frame 0xfffffe0000b65ff0
KDB: enter: panic
[ thread pid 11 tid 100029 ]
Stopped at kdb_enter+0x37: movq $0,0x12bc1f6(%rip)

The bug occurs because only one of my two CPUs reports support for the
MCG_CMCI_P bit. On boot, which CPU the kernel queries for support is random.
If it queries the one that lacks the bit, it doesn't allocate memory for the
cmc state, but it later calls cmci_setup() on a CPU that does support the bit.
cmci_setup() then presumably dereferences the unallocated state, which would
explain the near-NULL fault address (0xd0) in the trap above.
The following command shows the asymmetry between the CPUs:

$ for x in $(jot $(sysctl -n hw.ncpu) 0); do sudo cpucontrol -m 0x179 \
    /dev/cpuctl$x; done | uniq -c
16 MSR 0x179: 0x00000000 0x0f000c14
16 MSR 0x179: 0x00000000 0x0f000814
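
To make the difference between the two values explicit, here is a minimal
sketch that decodes them in sh. It assumes the second word printed by
cpucontrol is the low 32 bits of IA32_MCG_CAP (MSR 0x179), and that the bit
layout is the one in the Intel SDM: bits 7:0 are the bank count and bit 10 is
MCG_CMCI_P. The function names are made up for illustration; they are not part
of any FreeBSD tool.

```shell
#!/bin/sh
# Decode selected fields of IA32_MCG_CAP (MSR 0x179).
# Assumed layout (Intel SDM): bits 7:0 = bank count, bit 10 = MCG_CMCI_P.
# Function names are hypothetical, for illustration only.

mcg_bank_count() {
    # Bits 7:0 hold the number of machine-check banks.
    echo $(( $1 & 0xff ))
}

mcg_has_cmci() {
    # Succeeds (exit status 0) when bit 10, MCG_CMCI_P, is set.
    [ $(( ($1 >> 10) & 1 )) -eq 1 ]
}

# The two distinct values reported above, one per group of 16 logical CPUs.
for cap in 0x0f000c14 0x0f000814; do
    if mcg_has_cmci "$cap"; then cmci=yes; else cmci=no; fi
    echo "$cap: banks=$(mcg_bank_count "$cap") CMCI_P=$cmci"
done
```

This prints:

0x0f000c14: banks=20 CMCI_P=yes
0x0f000814: banks=20 CMCI_P=no

So both groups report 20 banks, but only the first group advertises CMCI
support, which is the asymmetry described above.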
