PowerMac G5 powerpc64: new context where repeatedly booting varies between failing and working

Mark Millard markmi at dsl-only.net
Wed Feb 18 05:35:03 UTC 2015


[I had sent Nathan W. and Justin H. a picture of a display of a boot-time corrupted memory region. This time I tried to find the start and end of the region, and I am documenting it in a textual form more appropriate to the list. I have also removed the prior Email history from this Email, but that history holds much of the context, so one must check it.]

Several of the new values put in place by the .got memory corruption reported below match up with .opd or other types of addresses reported by objdump for my /boot/kernel10.1S/kernel. They are noted below as I list detailed differences.

I made the early-boot-crash display cover a larger range, and the corruption of part of the .got area seemed to span the range described below. I also induced a dereference of the bad pointer as soon as it is discovered after the OF_peer(0) in question returns, so that later code would not be involved when it crashes. (Crash early, crash often...)


Overall structure:

0xd2da37 and before as far as I looked: no corruption found.

The area from 0xd2da38-0xd2dc9F: largely corrupted. 0x268 or 616 bytes or so in this corrupted range. 616=77*8.

After that range: good again as far as I looked.
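
The size arithmetic above can be double-checked with a minimal Python sketch. The two bounds are the ones quoted in this message; the variable names are only illustrative, and the range is assumed to be inclusive at both ends.

```python
# Bounds of the largely corrupted area, as quoted in the message.
# Assumption: the range is inclusive at both ends.
first_bad = 0xd2da38
last_bad = 0xd2dc9f

size = last_bad - first_bad + 1

# A 64-bit .got entry is 8 bytes, so check that the size is a whole
# number of doublewords.
doublewords, remainder = divmod(size, 8)

print(hex(size), size, doublewords, remainder)  # 0x268 616 77 0
```

So the range covers exactly 77 doublewords, starting on an 8-byte boundary (0xd2da38).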


The details:

Warning: The below is based on hand transcribed information from screen pictures that I took.

Showing pairs of lines (good then corrupted), using x/x style lines:

0xd2da30: 0, b4fd2c, 0, b4fd70
0xd2da30: 0, b4fd2c, 0,      0

0xd2da40: 0,   e28948, 0, e1e460
0xd2da40: 0, 24000042, 0, d00058
(24000042 looks like a cr value?)
(0000000000d00058 l       .opd   0000000000000018 ofw_rendezvous_dispatch)

0xd2da50: 0, bc7de8,        0, bc7e08
0xd2da50: 0, cde110, c0000000,   8740
(0xc000000000008740 looks like a stack address?)
(0000000000cde110 g     F .opd   0000000000000018 smp_no_rendevous_barrier)

0xd2da60: 0, cd8470, 0, bd2608
0xd2da60: 0,      1, 0, c3a30c
(0000000000c3a30c g       .data  0000000000000000 ofw_sprg0_save)

0xd2da70: 0,  bb5ea0, 0, b70870
0xd2da70: 0, 1c35ec0, 0,      0

0xd2da80: 0,   c49918, 0, bc7e18
0xd2da80: 0, 44000022, 0, de4b30
(44000022 looks like a cr value?)
(0000000000de4b30 g     O .bss   0000000000000460 thread0)

0xd2da90:        0, b720a0, 0,   b71370
0xd2da90: 90000000,   1032, 0, ff846d78
(9000000000001032 looks like a SRR1 value.)
(ff846d78 is openfirmware entry point?)

0xd2daa0: 0, bc7e30,        0,   bc7e58
0xd2daa0: 0, e39080, 10000000,   3030
(0000000000e39080 g     O .bss   0000000000020000 __pcpu)
(1000000000003030 looks like a SRR1 value?)

0xd2dab0:        0, bc7e80, 0, bc7eb0
0xd2dab0: c0000000,   83b0, 0, c3a280
(0xc0000000000083b0 looks like a stack address?)
(c3a280 is inside my PowerMac G5 specific hack's ofwstk area: c392a0 up to c3a2a0)
(I've been gathering evidence about early-boot G5 crashes.)

0xd2dac0: 0, bc7ed0, 0, cf2960
0xd2dac0: 0, c40000, 0, c40000

0xd2dad0: 0, bc7f00, 0, bc7f28
0xd2dad0: 0, c40000, 0, c40000

0xd2dae0:        0, b72400, 0, bc7f28
0xd2dae0: c0000000,   8740, 0, cde110
(0xc000000000008740 looks like a stack address?)
(0000000000cde110 g     F .opd   0000000000000018 smp_no_rendevous_barrier)

0xd2daf0: 0, cf2b28, 0, b716a0
0xd2daf0: 0, d00058, 0, cde110
(d00058 was also at 0xd2da4c and was followed by cde110 there.)
(0000000000cde110 g     F .opd   0000000000000018 smp_no_rendevous_barrier)

0xd2db00: 0, cf2b88, 0, cf2b70
0xd2db00: 0, e6c280, 0,      0
(e6c280 is inside the emergency_buffer.7752 area: e6c278 up to e6c378)

0xd2db10:        0, cf2b58,        0, 8480
0xd2db10: 90000000,   1032, c0000000, 8740
(9000000000001032 looks like a SRR1 value?)
(0xc000000000008740 looks like a stack address?)

0xd2db20: 0, c2d920, 0, cf2b10
0xd2db20: 0, c2d920, 0, cf2b10 (yep: unchanged!)

0xd2db30: 0,   b71718,        0, c49888
0xd2db30: 0, ff846734, 10000000,   3030
(ff846734 would seem to be an openfirmware code address?)
(1000000000003030 looks like a SRR1 value?)

0xd2db40: 0, c498a0, 0,   c54000
0xd2db40: 0, c498a0, 0, ff846d78
(Yep: c498a0 was unchanged)
(ff846d78 is openfirmware entry point?)

0xd2db50:        0, e313a8, 0, e31608
0xd2db50: 24000042, e313a8, 0,      0
(24000042 looks like a cr value?)
(Trying to store to address 0x2400004200e313a8 for a specific
type of 10.1-STABLE build is how the problem was originally
noticed.)
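
As a hedged illustration of how the bad store address arises from the dump: each x/x line shows four 32-bit words, and on big-endian powerpc64 a 64-bit .got entry is the first word of a pair (high half) followed by the second (low half). The helper name join_words is hypothetical, not something from the kernel or the debugger.

```python
def join_words(hi, lo):
    """Combine two 32-bit words, in big-endian order, into one 64-bit value."""
    return ((hi & 0xffffffff) << 32) | (lo & 0xffffffff)

# Good line at 0xd2db50:      0,        e313a8, 0, e31608
good_pointer = join_words(0x0, 0xe313a8)

# Corrupted line at 0xd2db50: 24000042, e313a8, 0, 0
bad_pointer = join_words(0x24000042, 0xe313a8)

print(hex(good_pointer))  # 0xe313a8
print(hex(bad_pointer))   # 0x2400004200e313a8
```

Only the high word of the entry was overwritten, which is why the low 32 bits of the bad address still match the original pointer.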

0xd2db60: 0, c31f80, 0, bc81e8
0xd2db60: 0, c31f80, 0,      0
(Yep: 0x0000000000c31f80 is unchanged.)

0xd2db70:      0, e31408, 0, bc8228
0xd2db70: 200000, e31408, 0, bc8228
(Yep: Only the 0x200000 was a change.)

0xd2db80: 0, c32488,        0, bc8238
0xd2db80: 0,      1, 10000000,   3030
(1000000000003030 looks like a SRR1 value?)

0xd2db90: 0, e1e460, 0,   c31fc0
0xd2db90: 0,      0, 0, 7ff7e800

0xd2dba0: 0,   e31608, 0, bc8260
0xd2dba0: 0, 1000000a, 0, bc8260
(Yep: 0x0000000000bc8260 unchanged.)

0xd2dbb0: 0, e1e460, 0, e1fa60
0xd2dbb0: 0, e1e460, 0, e1fa60 (yep: unchanged!)

0xd2dbc0:      0, bc8288,        0, c32488
0xd2dbc0: 111081,      0, fd3c2000,      0
(fd3c2000 in openfirmware area?)

0xd2dbd0: 0, e3153c, 0, bc8298
0xd2dbd0: 10,     0, 0,      0

Now a few unchanged: 0xd2dbe0-0xd2dc1F

Then a change in the pattern of corruptions for the rest of the corrupted area:

0xd2dc20: 0, bc8288,       0, bc82e8
0xd2dc20: 0, bc8288, 127f500, bc82e8

Note how bc8288 and bc82e8 did not change.
From here on those two columns are not
corrupted but the other two are.

0xd2dc30:       0, bc8300,      0, c32488
0xd2dc30: 8000000, bc8300, e7d540, c32488

0xd2dc40:     0, b4fef0,       0, e31558
0xd2dc40: ecc40, b4fef0, 84eec80, e31558

0xd2dc50:       0, bc8308,       0, cf2f00
0xd2dc50: 1e85440, bc8308, 8766200, cf2f00

0xd2dc60:      0, bc8310,       0, bc8350
0xd2dc60: fb9040, bc8310, 93bb000, bc8350

0xd2dc70:       0, c32038,       0, de5718
0xd2dc70: 94f6b00, c32038, 8632600, de5718

0xd2dc80:       0, de7768,       0, bc3760
0xd2dc80: 1fc0f40, de7768, 10f4b40, bc3760

0xd2dc90:       0, de7768,      0, e1fa00
0xd2dc90: 99e5700, cfc658, 228740, e1fa00

And after that things match for as far as I've looked: no corruptions.
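
For anyone wanting to redo the comparison mechanically, here is a minimal Python sketch that takes one good/corrupted pair of lines in the format used above and reports which 32-bit words changed. The function names parse_line and changed_words are hypothetical, and the sketch assumes plain "address: w0, w1, w2, w3" lines with any trailing annotation already removed.

```python
def parse_line(line):
    """Split 'ADDR: w0, w1, w2, w3' into (address, list of 32-bit words)."""
    addr_text, words_text = line.split(":", 1)
    words = [int(w.strip(), 16) for w in words_text.split(",")]
    return int(addr_text.strip(), 16), words

def changed_words(good, bad):
    """Return (word address, good value, bad value) for each differing word."""
    good_addr, good_words = parse_line(good)
    bad_addr, bad_words = parse_line(bad)
    if good_addr != bad_addr:
        raise ValueError("the two lines describe different addresses")
    return [
        (good_addr + 4 * i, g, b)
        for i, (g, b) in enumerate(zip(good_words, bad_words))
        if g != b
    ]

# The 0xd2dc30 pair from the listing above.
good = "0xd2dc30:       0, bc8300,      0, c32488"
bad = "0xd2dc30: 8000000, bc8300, e7d540, c32488"

diffs = changed_words(good, bad)
for addr, g, b in diffs:
    print(f"{addr:#x}: {g:#x} -> {b:#x}")
```

For this pair only the words at 0xd2dc30 and 0xd2dc38 differ, which matches the pattern noted for the tail of the corrupted area: the high word of each doubleword changes and the low word does not.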





===
Mark Millard
markmi at dsl-only.net



