head -r347003 on 2-socket/2-cores-each G5 PowerMac11,2's: one type of boot-blocking context found (CPU 1 evidence)
Mark Millard
marklmi at yahoo.com
Wed May 8 12:31:09 UTC 2019
This note covers the "CPU 1" hang evidence; the CPU 2
evidence will follow in a later note. The two do not behave
the same. (These are the CPUs that have been failing recently,
but the CPU numbers need not be stable long term.)
CPU 1 is the one that gets as far as trying:
sp = pcpup->pc_curpcb->pcb_sp;
in cpudep_ap_bootstrap but sometimes hangs attempting
the pc_curpcb-> dereference, so cpudep_ap_bootstrap never finishes.
(Code I added at the end of the routine never gets its result
into memory. The bsp times out waiting for CPU 1 to become
awake and so skips CPU 1.)
I'll note that slb index 0 is not assigned a V=1 status
on CPU 1 at this stage. Indexes 1-63 are. (These were
extracted live values, not from FreeBSD data structures.)
When it works, CPU 1 sees the following (the names show
where the values come from; this is one specific example
boot):
pcpup_value        =0x197c380
pc_idlethread_value=0xc00000000224f580
td_pcb_value       =0xe000000064beca90
pcb_sp_value       =0xe000000064bec8f0
When it fails, it cannot get that last value (pcb_sp_value)
via the td_pcb_value:
pcpup_value        =0x197c380
pc_idlethread_value=0xc00000000224f580
td_pcb_value       =0xe000000064beca90
Note: pcpup values are not from the DMAP
space or the KVA space.
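
For readers who want to see the shape of the pointer chain, here is a
minimal user-space sketch of recording each step into globals before
taking the next dereference. The struct definitions are cut-down
stand-ins, and record_chain, last_stage and the *_value globals are
hypothetical names for illustration; this is not the instrumentation
actually used in the kernel and it only models the fields named above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for the kernel structures. */
struct pcb    { uint64_t pcb_sp; };
struct thread { struct pcb *td_pcb; };
struct pcpu   { struct thread *pc_idlethread; struct pcb *pc_curpcb; };

/*
 * Hypothetical breadcrumb slots. Each one is written before the next
 * dereference, so the last slot filled in shows how far the chain got.
 */
volatile uint64_t pcpup_value, pc_idlethread_value, td_pcb_value, pcb_sp_value;
volatile int last_stage;

uint64_t
record_chain(struct pcpu *pcpup)
{
	pcpup_value = (uint64_t)(uintptr_t)pcpup;
	last_stage = 1;

	pc_idlethread_value = (uint64_t)(uintptr_t)pcpup->pc_idlethread;
	last_stage = 2;

	td_pcb_value = (uint64_t)(uintptr_t)pcpup->pc_idlethread->td_pcb;
	last_stage = 3;

	pcpup->pc_curpcb = pcpup->pc_idlethread->td_pcb;

	/* In the failing case described above, this load never completes. */
	pcb_sp_value = pcpup->pc_curpcb->pcb_sp;
	last_stage = 4;

	return (pcb_sp_value);
}
```

With this shape, a failing boot would leave last_stage at 3 and
pcb_sp_value unset, which is the pattern in the failing listing above.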
In the working case, the CPU 1 slbs already had:
(These were extracted live values, not from FreeBSD
data structures.)
39: esid_part= 0x8000000 vsid_part=0x1000000000000
17: esid_part=0xc000000008000000 vsid_part=0x10000ecc40100
61: esid_part=0xe000000068000000 vsid_part=0x100087a5a0000
(So no need for handle_kernel_slb_spill.)
In a failing case:
(These were extracted live values, not from FreeBSD
data structures.)
30: esid_part= 0x8000000 vsid_part=0x1000000000000
15: esid_part=0xc000000008000000 vsid_part=0x10000ecc40100
(no esid_part=0xe000000068000000)
(Others are similar.)
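
The difference between the two listings can be checked mechanically.
The sketch below assumes 256 MB segments with the esid_part layout
I believe the values above follow: the ESID in the bits above bit 27
and the V flag at 0x08000000. The names ESID_MASK, ESID_VALID and
slb_covers are hypothetical, and the arrays hold only the entries
quoted above, not all 64 slb slots.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed esid_part layout for 256 MB segments (hypothetical names). */
#define ESID_MASK  0xfffffffff0000000ULL	/* segment base bits */
#define ESID_VALID 0x0000000008000000ULL	/* V=1 flag */

/* Only the entries quoted in the note, not full slb dumps. */
const uint64_t working_esids[] = {
	0x8000000ULL, 0xc000000008000000ULL, 0xe000000068000000ULL
};
const uint64_t failing_esids[] = {
	0x8000000ULL, 0xc000000008000000ULL
};

/* True if some valid entry maps the segment containing ea. */
bool
slb_covers(const uint64_t *esid_parts, size_t n, uint64_t ea)
{
	for (size_t i = 0; i < n; i++) {
		if ((esid_parts[i] & ESID_VALID) == 0)
			continue;	/* skip entries without V=1 */
		if ((esid_parts[i] & ESID_MASK) == (ea & ESID_MASK))
			return (true);
	}
	return (false);
}
```

Under those assumptions, slb_covers(working_esids, 3, 0xe000000064beca90)
is true while slb_covers(failing_esids, 2, 0xe000000064beca90) is false:
td_pcb_value sits in the 0xe000000060000000 segment, which only the
working listing has. Both listings cover pcpup_value and
pc_idlethread_value.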
In the failing case, the code I put in handle_kernel_slb_spill
to record into extra global variables never runs.
As far as I can tell, CPU 1 never gets to
handle_kernel_slb_spill at all. (Not that I've figured out
how to find where CPU 1 does get to.)
Side note on the esid's for addresses below DMAP_START:
39: esid_part= 0x8000000 vsid_part=0x1000000000000
53: esid_part=0x98000000 vsid_part=0x1000b19300000
54: esid_part=0xa8000000 vsid_part=0x1000c54e00000
51: esid_part=0xf8000000 vsid_part=0x100127f500000
and (different example):
30: esid_part= 0x8000000 vsid_part=0x1000000000000
52: esid_part=0x88000000 vsid_part=0x10009dd800000
38: esid_part=0x98000000 vsid_part=0x1000b19300000
54: esid_part=0xa8000000 vsid_part=0x1000c54e00000
and (different example):
50: esid_part=0x78000000 vsid_part=0x10008a1d00000
53: esid_part=0x98000000 vsid_part=0x1000b19300000
54: esid_part=0xa8000000 vsid_part=0x1000c54e00000
49: esid_part=0xf8000000 vsid_part=0x100127f500000
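
To compare the three listings above, each esid_part can be reduced to
a 256 MB segment number (esid_part >> 28) and the sets intersected.
This is a sketch under the same assumed layout (V flag at 0x08000000);
low_segment_mask and the example arrays are hypothetical names, and the
arrays contain only the entries as listed, which may not be every low
entry that was present.

```c
#include <stddef.h>
#include <stdint.h>

#define ESID_SHIFT 28				/* 256 MB segments (assumed) */
#define ESID_VALID 0x0000000008000000ULL	/* V=1 flag (assumed) */

const uint64_t example1[] = { 0x8000000, 0x98000000, 0xa8000000, 0xf8000000 };
const uint64_t example2[] = { 0x8000000, 0x88000000, 0x98000000, 0xa8000000 };
const uint64_t example3[] = { 0x78000000, 0x98000000, 0xa8000000, 0xf8000000 };

/* Bit k of the result is set when a valid entry maps segment number k. */
uint64_t
low_segment_mask(const uint64_t *esid_parts, size_t n)
{
	uint64_t mask = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t seg = esid_parts[i] >> ESID_SHIFT;

		if ((esid_parts[i] & ESID_VALID) != 0 && seg < 64)
			mask |= 1ULL << seg;
	}
	return (mask);
}
```

ANDing the three masks leaves only segments 9 and 10 (0x90000000 and
0xa0000000) in common; segments 0, 7, 8 and 15 each appear in some
listings but not all of them.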
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)