FreeBSD 6.x CVSUP today crashes with zero load ...
dmitry at atlantis.dp.ua
Tue Jun 27 06:41:26 UTC 2006
On Tue, 27 Jun 2006, M.Hirsch wrote:
> Yes, the result may be correct.
If you're talking about a single-bit error, you aren't quite correct. It isn't
"may be correct", it's _definitely_ correct (in the mathematical sense; that is,
the correcting code proves that we have one and only one error, in bit number
N; the hardware just inverts this bit, and the result _is_ OK).
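
To make the "bit number N" point concrete, here is a minimal sketch using the
textbook Hamming(7,4) code. It is only an illustration: real ECC memory works
on much wider words with a different code layout, and the function names below
are made up for this example. The syndrome computed from the parity checks is
exactly the 1-based position of the flipped bit, so correction is a single
inversion.

```python
# Illustrative Hamming(7,4) sketch; names are hypothetical, not a real ECC API.
# Codeword positions 1..7 hold: p1 p2 d1 p4 d2 d3 d4.

def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (list of 0/1)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_syndrome(word):
    """Return 0 for a valid codeword, else the 1-based position of a single flipped bit."""
    s1 = word[0] ^ word[2] ^ word[4] ^ word[6]
    s2 = word[1] ^ word[2] ^ word[5] ^ word[6]
    s4 = word[3] ^ word[4] ^ word[5] ^ word[6]
    return s1 + 2 * s2 + 4 * s4

def hamming74_correct(word):
    """Invert the bit named by the syndrome; return (corrected word, position or 0)."""
    pos = hamming74_syndrome(word)
    fixed = list(word)
    if pos:
        fixed[pos - 1] ^= 1
    return fixed, pos

def hamming74_decode(word):
    """Extract the 4 data bits from a (corrected) codeword."""
    return [word[2], word[4], word[5], word[6]]

if __name__ == "__main__":
    good = hamming74_encode([1, 0, 1, 1])
    bad = list(good)
    bad[4] ^= 1  # simulate a single-bit upset at position 5
    fixed, pos = hamming74_correct(bad)
    print("flipped position:", pos, "recovered:", hamming74_decode(fixed))
```

Note that this guarantee holds only under the assumption stated above: exactly
one bit was flipped in the word.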
> 'Do not take "ECC" for "equals additional security"'
Not security. ECC adds reliability.
> But, in FreeBSD, the function is a result of hardware-level correction.
> Something that only kicks in in _real_ _serious_ situations.
> I just would like you (not specifically you, Dmitry) to acknowledge that
> broken RAM is worth a "panic" in "standard situations"- if I may call it like
The predominant RAM errors are exactly the single-bit ones. Moreover,
usually they _don't_ reappear at the same cell. They may be caused, for
example, by spontaneous alpha radioactivity (brought into your
computer by the usual dust) and as such don't indicate that the RAM module must
be replaced. They just break your data in an unpredictable way, not your
hardware. They (single-bit errors) are the main reason why ECC-capable memory
and chipset must be used in the computer which calculates/transfers actually
> If the RAM is broken for some bits, chances are great that there are more
> following soon.
If a multiple-bit error happens, then yes, it can be a sign of an actual
hardware fault. And yes, the ECC logic will report this event instantly.
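
As a rough sketch of how the logic can tell the two cases apart, here is a toy
SECDED (single-error-correct, double-error-detect) check: the Hamming(7,4)
code extended with one overall parity bit. The names are hypothetical and the
word size is far smaller than what real memory controllers use; it also only
covers the double-bit case, since larger error counts can escape a code of
this kind.

```python
# Toy SECDED sketch: Hamming(7,4) plus an overall parity bit (8 bits total).
# Function names are hypothetical, chosen for this illustration only.

def secded_encode(data):
    """Encode 4 data bits as p1 p2 d1 p4 d2 d3 d4 plus an overall parity bit."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    word = [p1, p2, d1, p4, d2, d3, d4]
    overall = 0
    for bit in word:
        overall ^= bit  # makes the parity of all 8 bits even
    return word + [overall]

def secded_check(word):
    """Classify an 8-bit word as 'ok', 'single' (correctable) or 'double' (report only)."""
    syndrome = ((word[0] ^ word[2] ^ word[4] ^ word[6])
                + 2 * (word[1] ^ word[2] ^ word[5] ^ word[6])
                + 4 * (word[3] ^ word[4] ^ word[5] ^ word[6]))
    parity = 0
    for bit in word:
        parity ^= bit  # 0 for any valid codeword
    if syndrome == 0 and parity == 0:
        return "ok"
    if parity == 1:
        # An odd number of flips: treated as one error, which can be corrected.
        return "single"
    # Parity still even but the syndrome is nonzero: two bits flipped.
    return "double"

if __name__ == "__main__":
    word = secded_encode([1, 0, 1, 1])
    one = list(word)
    one[2] ^= 1
    two = list(one)
    two[5] ^= 1
    print(secded_check(word), secded_check(one), secded_check(two))
```

The design point is that a double-bit error leaves the overall parity even
while the syndrome is nonzero, so it is reported rather than "corrected" into
wrong data.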
Atlantis ISP, System Administrator
e-mail: dmitry at atlantis.dp.ua