bad RAM? prove it with a crash dump?
Nate Eldredge
nate at thatsmathematics.com
Thu May 6 15:26:23 UTC 2010
On Thu, 6 May 2010, Andrew Duane wrote:
> It is also useful to make sure that the garbage itself is different from
> one crash to the next. As mentioned before, a single-bit error in an
> otherwise valid value, or maybe a missing or scrambled byte, is a good
> indication of memory problems. If random places are often overwritten
> with something else, that could just be another piece of misbehaving code
> writing someplace it shouldn't. I've often found code that writes some
> buffer into, e.g., a piece of memory it no longer owns; that looks like
> memory corruption until you realize the garbage is always something
> specific, like a vnode structure.
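
The distinction drawn above (a lone flipped bit or damaged byte versus a
wholesale overwrite) can be roughed out in a few lines. This is only a
sketch: the function name, the 64-bit word width, and the category labels
are all made up for illustration, and it assumes you already know what the
value should have been.

```python
def classify_corruption(expected: int, observed: int, width: int = 64) -> str:
    """Roughly classify how `observed` differs from `expected`.

    Hypothetical helper: `width` is the assumed word size in bits.
    """
    mask = (1 << width) - 1
    diff = (expected ^ observed) & mask  # set bits mark the damaged positions
    if diff == 0:
        return "identical"
    if bin(diff).count("1") == 1:
        return "single-bit flip"  # suggestive of a hardware fault
    # Check whether every damaged bit lies inside one byte of the word.
    for shift in range(0, width, 8):
        if diff & ~(0xFF << shift) == 0:
            return "single-byte damage"
    return "multi-byte overwrite"  # more likely a stray write by other code
```

A "single-bit flip" result across several dumps, each at a different
address, points toward hardware; a "multi-byte overwrite" whose contents
always look like the same structure points toward software.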
There are trickier things too. I once had a machine with bad cache memory
where once in a while you would get a cache line that had come from
somewhere else in memory. This was particularly vexing when it happened
to an I/O buffer, and I wound up with a large zip file that had 32 bytes
of libc.so somewhere in the middle... :-(
And of course, swapping out the RAM wouldn't have fixed it.
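
If you happen to have a known-good copy of the damaged data, one way to
spot this pattern is to compare the two copies block by block at cache-line
granularity. A minimal sketch, assuming a 32-byte line (as the zip file
incident suggests) and a hypothetical helper name:

```python
def differing_lines(good: bytes, bad: bytes, line_size: int = 32) -> list:
    """Return offsets of line_size-aligned blocks where the copies differ.

    `line_size` is an assumed cache-line size; adjust it for the CPU at hand.
    """
    if len(good) != len(bad):
        raise ValueError("copies must be the same length")
    return [off for off in range(0, len(good), line_size)
            if good[off:off + line_size] != bad[off:off + line_size]]


# Made-up demonstration data: replace one aligned 32-byte block.
good = bytes(range(256)) * 2
bad = bytearray(good)
bad[96:128] = b"\x7fELF" * 8
print(differing_lines(good, bytes(bad)))  # prints [96]
```

Damage confined to a single aligned block, with contents that look like
they belong to some other file, is a hint that a whole line was swapped
rather than a bit flipped.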
--
Nate Eldredge
nate at thatsmathematics.com