PowerMac G5 hangs/crashes on boot: 10.2, 11.0-RCx

Mark Millard markmi at dsl-only.net
Fri Sep 9 20:48:16 UTC 2016


On 2016-Sep-9, at 11:36 AM, Krzysztof Parzyszek <kristof T swissmail.org> wrote:
> 
> On 9/9/2016 6:35 AM, Jukka Ukkonen wrote:
>> 
>> The story apparently goes such that the interrupt code shown can be
>> pretty much anything. The interrupts might simply be enabled way before
>> the system is ready to handle them.
> 
> I've had similar issues for quite some time.  Previous releases would boot only sometimes, otherwise I'd be getting a hang or a crash.  The frequency of the boot problems seems to increase dramatically when I boot from the hard-drive, but with 11 it has never booted correctly.
> 
> I wasn't the only one seeing this type of a problem and I remember seeing a thread about it a while back.  Mark Millard reported it, and someone has tracked it down to some register getting (unexpectedly) clobbered by the open firmware.  I was hoping this had been fixed, but it seems that things have only gotten worse...  :(
> 
> CCing Mark---maybe he will know more about this.
> 
> -Krzysztof

Unfortunately, as far as powerpc and powerpc64 go: I've not had access to any powerpc or powerpc64 hardware since very early June 2016 and will not for a few more weeks. (And, yes, the context is PowerMacs specifically.)

So I have not tested whether my personal kernel hack (which made the PowerMac G5s boot reliably in my use) helps on any more recent FreeBSD versions. It is unlikely that I'll get to that before sometime in October. Until then I'll not be much direct help.

I'm the one who isolated examples of memory and register corruption on PowerMac G5s, before identifying the specific hack that I used to avoid them.

Beyond reporting the hack on the lists, I also submitted a bugzilla report documenting which change made the observed difference in boot reliability (in the older context, anyway):

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=205458 (from 2015-Dec-20)

It describes the technique as:

> The change is in ofw_sprg_prepare of sys/powerpc/ofw/ofw_machdep.c and could look something like (presented in a form to show new/PowerMacG5-Specific code and old general code):
> 
> #ifdef POWERMAC_G5_SPECIFIC_BUILD
> 	__asm __volatile("mfsprg0 %0\n\t"
> 			 "mtsprg1 %1\n\t"
> 			 "mtsprg2 %2\n\t"
> 			 "mtsprg3 %3\n\t"
> 			 : "=&r"(ofw_sprg0_save)
> 			 : "r"(ofmsr[2]),
> 			 "r"(ofmsr[3]),
> 			 "r"(ofmsr[4]));
> #else
> // The historical code:
> 	__asm __volatile("mfsprg0 %0\n\t"
> 			 "mtsprg0 %1\n\t"
> 			 "mtsprg1 %2\n\t"
> 			 "mtsprg2 %3\n\t"
> 			 "mtsprg3 %4\n\t"
> 			 : "=&r"(ofw_sprg0_save)
> 			 : "r"(ofmsr[1]),
> 			 "r"(ofmsr[2]),
> 			 "r"(ofmsr[3]),
> 			 "r"(ofmsr[4]));
> #endif
> 
> In other words: for PowerMac G5's omit the mtsprg0 from ofmsr[1]: leave the register as it already is instead of resetting it. The value in ofmsr[1] is inappropriate to the context. I deliberately kept the change minimal and left in all other code related to the register.
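
To make the difference between the two sequences concrete without PowerPC hardware, here is a minimal user-space sketch. It is a toy model, not kernel code: the struct and the two function names are made up for illustration, and a plain array stands in for the four SPRG registers. The only things taken from the report are the ordering of the moves and the ofmsr[1]..ofmsr[4] indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for SPRG0..SPRG3; this does NOT touch real registers. */
struct sprg_model {
	uint64_t sprg[4];
};

/*
 * Models the historical sequence: save SPRG0, then load SPRG0..SPRG3
 * from ofmsr[1]..ofmsr[4]. Returns the saved SPRG0 value.
 * ofmsr must have at least five elements (indices 1..4 are read).
 */
static uint64_t
model_sprg_prepare_historical(struct sprg_model *m, const uint64_t *ofmsr)
{
	assert(m != NULL && ofmsr != NULL);
	uint64_t sprg0_save = m->sprg[0];	/* mfsprg0 */
	m->sprg[0] = ofmsr[1];			/* mtsprg0 */
	m->sprg[1] = ofmsr[2];			/* mtsprg1 */
	m->sprg[2] = ofmsr[3];			/* mtsprg2 */
	m->sprg[3] = ofmsr[4];			/* mtsprg3 */
	return (sprg0_save);
}

/*
 * Models the PowerMac G5 variant: identical, except that SPRG0 is
 * left holding whatever it already held (no mtsprg0 from ofmsr[1]).
 */
static uint64_t
model_sprg_prepare_g5(struct sprg_model *m, const uint64_t *ofmsr)
{
	assert(m != NULL && ofmsr != NULL);
	uint64_t sprg0_save = m->sprg[0];	/* mfsprg0 */
	m->sprg[1] = ofmsr[2];			/* mtsprg1 */
	m->sprg[2] = ofmsr[3];			/* mtsprg2 */
	m->sprg[3] = ofmsr[4];			/* mtsprg3 */
	return (sprg0_save);
}
```

Both variants return the same saved value; the only observable difference in the model is whether element 0 ends up holding ofmsr[1] or its prior contents.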

All the evidence for this hack is observational. I've never figured out a reasonable way to find out what Apple's openfirmware does with the register involved, or in what contexts. I wish I had better evidence for what is going on without the hack. With only this kind of evidence, the change is purely a hack for now, even if a theory-of-operation justification exists (none is known yet).

But as for the degree of observation: in isolating this I went through well over 10,000 failing boots (spread over months, although not as continuous activity). Frequently I'd have to try booting over a dozen times in a row before one would make it through, which is part of why the total is so large. Since applying the hack I've not had any such failing boots, but I also boot far less frequently, since I no longer need to force reboots. (I always buildworld buildkernel from source and my source has the hack.)

I have no evidence about the hack from after early June 2016.

The lists have more information from when I was investigating the issue, such as the memory and register corruptions that I observed prior to isolating the small change. But going through those notes in any detail is a mess, and I'm not likely to do it without strong motivation.

I've no evidence that the change would be appropriate outside a PowerMac G5 at all. This alone would keep FreeBSD from adopting it in a generic build (even if a PowerMac G5 theory-of-operation justification were known). The submission only suggested having a pre-made hook for manually building from source for a PowerMac G5.

Part of the issue is that I do not know a way to identify, at run time, that the machine is a PowerMac G5 without using openfirmware. Any use of openfirmware to figure that out would re-create the problem, as far as I can tell. So it appears that the build itself needs to be PowerMac G5 specific to avoid the problem.
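
As a rough illustration of what "build-specific" means here: the decision can be a compile-time constant, so nothing has to ask the firmware at boot. The helper name below is hypothetical (it is not an existing FreeBSD function); POWERMAC_G5_SPECIFIC_BUILD is the macro name used in the bug report, and how it would get defined for a given kernel build is left open.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only. The result is fixed when
 * the code is compiled: true only if POWERMAC_G5_SPECIFIC_BUILD was
 * defined for the build. No firmware call is made to decide.
 */
static inline bool
ofw_leave_sprg0_alone(void)
{
#ifdef POWERMAC_G5_SPECIFIC_BUILD
	return (true);
#else
	return (false);
#endif
}
```

A build without the define behaves exactly like the historical code path; only someone deliberately building for a PowerMac G5 would get the other branch.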

I will note that I've never needed or used the hack on PowerMac G4s or a PowerMac G3. But, again, my evidence ends in early June 2016.

===
Mark Millard
markmi at dsl-only.net


