[Bug 228056] powerpc64: MCE on POWER9 machine (AC922)
bugzilla-noreply at freebsd.org
Tue May 8 00:52:50 UTC 2018
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228056
Bug ID: 228056
Summary: powerpc64: MCE on POWER9 machine (AC922)
Product: Base System
Version: CURRENT
Hardware: powerpc
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: kern
Assignee: bugs at FreeBSD.org
Reporter: breno.leitao at gmail.com
I am creating this bug to track my progress on investigating the bootstrap of
FreeBSD on an AC922 (POWER9) machine.
When I boot HEAD, I hit the following MCE:
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2018 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.0-CURRENT #152 66f063557f2(master)-dirty: Tue May 8 01:17:52 CET 2018
root at free8:/usr/obj/root/kernel/freebsd/powerpc.powerpc64/sys/BRENO powerpc
gcc version 4.2.1 20070831 patched [FreeBSD]
WARNING: WITNESS option enabled, expect reduced performance.
WARNING: DIAGNOSTIC option enabled, expect reduced performance.
Entering uma_startup with 44 boot pages configured
startup_alloc from "UMA Kegs", 41 boot pages left
startup_alloc from "UMA Zones", 40 boot pages left
startup_alloc from "UMA Zones", 38 boot pages left
startup_alloc from "UMA Zones", 36 boot pages left
start at c000000001e30100
KERNEL BASE at 100100
sum is c000000001d30000
fatal kernel trap:
exception = 0x200 (machine check)
srr0 = 0xc00000000255d284 (0x82d284)
srr1 = 0x9000000000201032
current msr = 0x9000000000000032
lr = 0xc00000000255d278 (0x82d278)
curthread = 0xc000000002e2bbc0
pid = 0, comm =
[ thread pid 0 tid 0 ]
Stopped at 0xc00000000255d284
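The addresses in the trap output can be cross-checked against the boot messages. A minimal sketch in C, assuming the parenthesized values after srr0 and lr are offsets into the kernel image and that the "sum is" line reports the load base (this is a reading of the log, not something the trap handler states). The helper names load_base and check_log_values are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: a runtime address minus its offset in the image
 * gives the base the image was loaded at.
 */
static uint64_t
load_base(uint64_t runtime_addr, uint64_t image_offset)
{
	return (runtime_addr - image_offset);
}

/* All constants are copied from the boot log and trap output above. */
static void
check_log_values(void)
{
	const uint64_t sum = 0xc000000001d30000ULL;	/* "sum is" line */

	/* "start at" minus "KERNEL BASE at" */
	assert(load_base(0xc000000001e30100ULL, 0x100100ULL) == sum);
	/* srr0 and its parenthesized offset */
	assert(load_base(0xc00000000255d284ULL, 0x82d284ULL) == sum);
	/* lr and its parenthesized offset */
	assert(load_base(0xc00000000255d278ULL, 0x82d278ULL) == sum);
}
```

All three pairs give 0xc000000001d30000, so the offset 0x82d284 can be looked up directly in a disassembly of the kernel image.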
Digging further, this is where it breaks (the faulting instruction is marked with ->>):
    82d264: 7f c3 f3 78 mr r3,r30
    82d268: 7e e4 bb 78 mr r4,r23
    82d26c: 7f 65 db 78 mr r5,r27
    82d270: 7f 86 e3 78 mr r6,r28
    82d274: 4b ff f8 49 bl 82cabc <.keg_alloc_slab>
    82d278: 7c 7d 1b 79 mr. r29,r3
    82d27c: 41 a2 00 94 beq+ 82d310 <.keg_fetch_slab+0x2cc>
    82d280: 7f bc eb 78 mr r28,r29
->> 82d284: e8 1d 00 00 ld r0,0(r29)
    82d288: 7f a0 f0 00 cmpd cr7,r0,r30
At this point, r29 contains:
db> print $r29
c00003fffffddf90
Looking at that code, I think we are here:
    slab = keg_alloc_slab(keg, zone, domain, allocflags);
    /*
     * If we got a slab here it's safe to mark it partially used
     * and return. We assume that the caller is going to remove
     * at least one item.
     */
    if (slab) {
->>     MPASS(slab->us_keg == keg);
where 'slab' is in r29 and 'us_keg' should be the very first field (offset 0),
which matches the ld r0,0(r29) instruction above. 'keg' should be in r30:
db> print $r30
c00003fffffd7000
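To illustrate why a check on the first field shows up as a load at displacement 0, such as ld r0,0(r29), here is a minimal sketch in C. The types slab_stub and keg_stub and the function slab_owned_by are hypothetical stand-ins, not the real UMA definitions. The only property taken from the analysis is that us_keg is the first member:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the keg type; contents are irrelevant here. */
struct keg_stub {
	int unused;
};

/* Hypothetical stand-in for the slab header. */
struct slab_stub {
	struct keg_stub *us_keg;	/* first member, so it lives at offset 0 */
	/* remaining members omitted */
};

/* Compile-time check of the layout assumption (C11). */
static_assert(offsetof(struct slab_stub, us_keg) == 0,
    "us_keg is assumed to be the first member");

/*
 * Reading slab->us_keg is the first memory access made through the slab
 * pointer, so an unreadable slab address faults here.
 */
static int
slab_owned_by(const struct slab_stub *slab, const struct keg_stub *keg)
{
	return (slab->us_keg == keg);
}
```

With this layout the comparison needs exactly one load through the slab pointer before the compare against keg, which is the shape of the ld/cmpd pair in the disassembly.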
The problem seems to be the dereference of slab (r29), which appears to be
what triggers the MCE.
This is the memory that r30 points to:
db> x $r30
0xc00003fffffd7000: c0000000
db>
0xc00003fffffd7004: 2af8bb8
But I am not able to dereference r29:
db> x $r29
0xc00003fffffddf90: (machine halts)
I am wondering why accessing this page is causing this problem.
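One way to compare the two pointers is to split each into a page base and an in-page offset. A minimal sketch in C, assuming 4 KiB pages (the page size actually in effect for these mappings is not stated in this report). PAGE_SHIFT_ASSUMED, PAGE_MASK_ASSUMED, page_base and page_offset are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. Adjust if the mapping uses a larger page size. */
#define PAGE_SHIFT_ASSUMED	12
#define PAGE_MASK_ASSUMED	((UINT64_C(1) << PAGE_SHIFT_ASSUMED) - 1)

static_assert(PAGE_MASK_ASSUMED == 0xfff, "sketch assumes 4 KiB pages");

/* Address of the first byte of the page containing va. */
static uint64_t
page_base(uint64_t va)
{
	return (va & ~PAGE_MASK_ASSUMED);
}

/* Byte offset of va within its page. */
static uint64_t
page_offset(uint64_t va)
{
	return (va & PAGE_MASK_ASSUMED);
}
```

Under that assumption, r29 (0xc00003fffffddf90) falls at offset 0xf90 in the page at 0xc00003fffffdd000, while r30 (0xc00003fffffd7000) sits at the start of a different page, six pages lower. The readable keg and the unreadable slab are therefore not on the same page.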