7.0-CURRENT Hang
Yar Tikhiy
yar at comp.chem.msu.su
Tue Feb 7 10:28:07 PST 2006
On Tue, Feb 07, 2006 at 10:06:32AM -0800, Cy Schubert wrote:
> In message <20060207173154.GE19674 at comp.chem.msu.su>, Yar Tikhiy writes:
> > On Mon, Feb 06, 2006 at 08:29:35PM -0800, Cy Schubert wrote:
> > >
> > > On the Pentium P54C model (that's an old 120 MHz Pentium I use as a 4.x,
> > > 5.x, and 7.x ports build testbed) the CPUID instruction, when called with
> > > AL = 0x02, returns EAX = EBX = ECX = EDX = 0. The code fragment in
> > > identcpu.c below results in "rounds" becoming 0xffffffff.
> > >
> > > do_cpuid(0x2, regs);
> > > rounds = (regs[0] & 0xff) - 1;
> > >
> > > The subsequent loop, shown below, will loop virtually forever (it takes
> > > forever for this machine to count down from 0xffffffff, performing a very
> > > great many calls to get_INTEL_TLB and virtually hanging the machine in
> > > the process).
> > >
> > > while (rounds > 0) {
> > > [... code ...]
> > > rounds--;
> > > }
> >
> > FWIW, my presumably P54C machine (Family 5 Model 2 Stepping 6)
> > doesn't indicate it has the CPUID 0x02 function. That is, CPUID
> > 0x00 returns EAX = 0x01, which is the highest function supported.
> > Could you try to run the misc/cpuid port on your Pentium and show
> > its output? It might appear that the code around CPUID 0x02 shouldn't
> > be reached at all in your case. Zero values from CPUID 0x02 are
> > pretty indicative of that.
>
> Mine is Family 5 Model 2 Stepping 12. All of my doc is for Pentium-Pro and
> newer so you are probably correct.

Do you know what CPUID function 0x00 returns in EAX for your CPU?
Hint: just run misc/cpuid once and show its output here. I've just
fixed the port so that it has no bogus dependencies and is very
lightweight.

> > Dealing with "rounds" equal to -1 can be a good idea anyway to catch
> > brain-dead CPUs instead of hanging the system on them.
>
> Well, with rounds = -1 [actually (unsigned int)0xffffffff], the CPU will
> "appear" to hang as it "rounds" or loops virtually forever -- counting back
> from 0xffffffff on a 120 MHz machine and performing get TLB info a number
> of times each iteration takes hours to do just a few iterations. I've seen
> mine go through "rounds", decrementing rounds-- each time, for hours at a
> time, though initially before digging into it using DDB it did appear that
> the CPU was hung, it was just starting to loop for 4,294,967,295 times. On
> older and slower machines, if it took hours to iterate through a few
> iterations, my guess is that it would take days to loop through this code.
> My patch allows it to take the defaults and finally boot. If the CPU
> doesn't support AL = 0x02, what's the point of looping? It appears to run
> nicely with the patch.

I do see that rounds = -1 is causing trouble.
I just meant that we should not call do_cpuid(0x02) at all if
(cpu_high < 2) because it can result in undefined behavior.
Your patch still makes sense because it deals with possible
brain-dead CPUs. I'd implement it in a slightly different
way though -- stay tuned! :-)
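
To make the two checks discussed above concrete, here is a minimal sketch,
not the actual identcpu.c code. The function names are hypothetical, and
"cpu_high" is assumed to hold the value CPUID 0x00 returned in EAX. The
first function reproduces the wraparound from the quoted fragment; the
second skips CPUID 0x02 results entirely when cpu_high < 2 and also
treats a zero count as "nothing to do".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical illustration of the unguarded computation: if the low
 * byte of EAX is 0, the unsigned subtraction wraps to 0xffffffff.
 */
uint32_t
cpuid2_rounds_unguarded(const uint32_t regs[4])
{
	return ((uint32_t)((regs[0] & 0xff) - 1));
}

/*
 * Hypothetical guarded variant. Returns the number of additional
 * CPUID 0x02 calls to make, or 0 when the function is unsupported
 * (cpu_high < 2) or the CPU reports a zero count.
 */
uint32_t
cpuid2_rounds_guarded(uint32_t cpu_high, const uint32_t regs[4])
{
	uint32_t count;

	assert(regs != NULL);

	if (cpu_high < 2)
		return (0);	/* CPUID 0x02 not supported; ignore regs */

	count = regs[0] & 0xff;
	if (count == 0)
		return (0);	/* would otherwise wrap to 0xffffffff */

	return (count - 1);
}
```

With all four registers zero, the unguarded version yields 0xffffffff,
while the guarded one yields 0 regardless of cpu_high, so the loop body
would never run.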
--
Yar