kern/59719 Re: 4.9 Stable Crashes on SuperMicro with SMP
Don Bowman
don at sandvine.com
Sat Nov 29 08:40:19 PST 2003
The following reply was made to PR kern/59719; it has been noted by GNATS.
From: Don Bowman <don at sandvine.com>
To: 'Uwe Doering' <gemini at geminix.org>,
freebsd-gnats-submit at FreeBSD.org
Cc: freebsd-bugs at freebsd.org, freebsd-stable at freebsd.org
Subject: RE: kern/59719 Re: 4.9 Stable Crashes on SuperMicro with SMP
Date: Sat, 29 Nov 2003 11:33:58 -0500
From: Uwe Doering [mailto:gemini at geminix.org]
> Jonathan Gilpin wrote:
> > I've run memtest (memtest86.com) kindly provided by Don and it passed all
> > the tests. I've installed a kernel module to test for memory
> > errors and found that again no memory errors are found... So this means it's
> > either a problem with the CPUs or a genuine bug in the kernel. (bugger!)
>
> No, that's unfortunately not what it means. If a memory test fails you
> can draw the conclusion that you have bad memory, but this doesn't work
> the other way round. If a memory test passes there is still a
> possibility that a memory chip is the culprit since memory test software
> cannot find all errors.
>
> Also, there is the chip set on the mainboard that coordinates bus access
> etc. for the two CPUs. Mainboard and chip set developers are known to
> make errors, too. In this case you would have to swap the entire
> mainboard, possibly with one from a different manufacturer. I can tell
> you from my own experience that it is really hard to find reliable PC
> hardware these days, in light of ever shorter and faster product release
> cycles.
I have several hundred of the motherboards the poster is using,
and they work reliably in MP operation with 4.X.
The memtest86 that I sent him understands the ECC registers
on the e7501 MCH, so it should find all correctable and uncorrectable
errors.
--don