Expert input required: P4 odd signals, no apparent memory fault,
DISABLE_PSE?
Jan Grant
Jan.Grant at bristol.ac.uk
Mon Oct 20 08:40:11 PDT 2003
I'm tracking -STABLE on a 1.8GHz P4 with 512MB of memory. Roughly since
the PAE changes were MFCed, I've been seeing memory-corruption-related
errors under specific circumstances: for example, a run of
portsdb -fUu
can be guaranteed to generate SIGBUS, SIGILL and SIGSEGV signals in a
handful of sh, sed, etc. processes.
However, reverting to a 4.8 kernel from prior to September either
hides/masks these errors, or no longer triggers them. The memory/mobo
_appears_ to check out OK under (ferinstance) extended runs of
memtest86.
Now, on -current I've seen reference to the DISABLE_PSE kernel option,
and some discussion that this behaviour may be due to a processor/timing
bug. So I have the following questions which I'd appreciate an expert
giving a definitive opinion on (I'm no x86/hardware hacker, me):
- are these problems potentially caused by this bug?
- what exactly does DISABLE_PSE do? (it's undocumented and a one-para
explanation of the expected behaviour of this option would be
appreciated)
- were any commits around the time of the MFC of the PAE code liable to
have introduced problems into the kernel which this workaround might
address?
I know it's a lot to ask, but both hardware and OS have been rock-solid
up until this point. Although I've not conclusively ruled out hardware
faults, the continued stability under high load of a pre-September 4.8
kernel makes me suspicious that this is more likely to be a bug getting
tickled than I'd normally suspect from these symptoms.
I'm about to experiment with this option but it currently feels a little
like cargo-cult admin. If there are any definitive tests that would
indicate if this hardware problem is present and addressed by this,
that'd be nice to know too.
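For concreteness, the experiment I have in mind is roughly the
following. This is a sketch only: I'm assuming DISABLE_PSE is set like
any other kernel option, and MYKERNEL is just a placeholder name for my
own config file.

    # /usr/src/sys/i386/conf/MYKERNEL  (hypothetical config name)
    # Option name as seen in the -current discussion; it is undocumented,
    # so its exact effect is an assumption on my part.
    options         DISABLE_PSE

After rebuilding and booting that kernel I'd re-run the portsdb command
above under the same load and see whether the signals still appear.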
Cheers,
jan
--
jan grant, ILRT, University of Bristol. http://www.ilrt.bris.ac.uk/
Tel +44(0)117 9287088 Fax +44 (0)117 9287112 http://ioctl.org/jan/
"No generalised law is without exception." A self-demonstrating axiom.