SMP and NMI errors (4.10)

Paul Civati paul at xciv.org
Sun Jul 11 16:19:25 PDT 2004


Doug White <dwhite at gumbysoft.com> wrote:

> NMIs sometimes get triggered for ECC corrections, which is why a
> memtester wouldn't see it. 

Doh, yeah.

> I think there is a kernel option somewhere that hooks
> NMI and attempts to get information from the platform as to what DIMM
> triggered it.

Can't see anything for that, and there is only one DIMM.  I have a
second one to go in that I can swap in to test.

> Otherwise you might check the Event Log in the BIOS for ECC events.

Alas no BIOS event log for this mobo.

> > If I boot a uniprocessor kernel this problem doesn't occur.
> It might be temperature related then :)

I was hoping no-one would say that :)

This is a 1U rack mount and currently seems to run at about 40 deg. C
on CPU1, 30 deg. C on CPU2 and 33 deg. C system temp.  First CPU
is a little hot perhaps, but I don't think it's too high to be a problem?

-Paul-



More information about the freebsd-stable mailing list