FreeBSD 6.x CVSUP today crashes with zero load ...

M.Hirsch M.Hirsch at hirsch.it
Mon Jun 26 22:40:04 UTC 2006


Dmitry Pryanishnikov wrote:

>
> Hello!
>
> On Mon, 26 Jun 2006, M.Hirsch wrote:
>
>> ECC is a way to mask broken hardware. I would rather have my hardware
>> fail outright the first time it misbehaves, so I can replace it
>> _immediately_.
>
>
>  You've got it backwards. If your data has any value to you, then you
> don't want to miss a single-bit error in it, do you? If you're running
> hardware without ECC, a single-bit error in your data will go to disk
> unnoticed, and you'll lose your data. With ECC, the hardware will
> correct it. In the (rare) case of a multiple-bit error, the ECC logic
> will generate an NMI for you, so you'll notice and "replace it
> _immediately_" instead of two weeks later, when your archive won't
> extract.
>
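The behaviour described in the quoted reply above (correct a single-bit error silently, flag a multiple-bit error loudly) can be illustrated with a toy SECDED code. This is only a minimal sketch, assuming an extended Hamming (8,4) code; real ECC memory typically protects 64 data bits with 8 check bits, and the names `encode` and `decode` are hypothetical, not taken from any FreeBSD or hardware interface.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword (list of bits)."""
    code = [0] * 8  # index 0: overall parity; indices 1, 2, 4: Hamming parity
    # Data bits live at positions 3, 5, 6, 7.
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    for p in (1, 2, 4):
        # Parity bit p covers every position whose index has bit p set.
        code[p] = sum(code[i] for i in range(1, 8) if i & p) % 2
    code[0] = sum(code[1:]) % 2  # overall parity over the other seven bits
    return code


def decode(code):
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i  # XOR of the positions of all set bits
    parity_broken = sum(code) % 2 == 1
    if syndrome == 0 and not parity_broken:
        status = "ok"
    elif parity_broken:
        # Odd number of flipped bits: assume exactly one and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        code = list(code)
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with intact overall parity: two bits flipped.
        # This is the case where real hardware would raise an NMI.
        return None, "uncorrectable"
    bits = [code[3], code[5], code[6], code[7]]
    return sum(b << i for i, b in enumerate(bits)), status
```

Flipping any one bit of `encode(n)` still decodes to `n` with status `corrected`; flipping any two bits yields `uncorrectable` rather than silently wrong data.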
Nope, I am right on track.
I do not want to lose any data. So I'd prefer an ECC error to raise a 
panic so I can replace the hardware ASAP.
Don't get me wrong, but tracking down bugs in FreeBSD is considerably more 
effort than "just" acquiring a new box...

>> What's your hardware good for if it passes a "test", but fails in 
>> production?
>
>
>  That's how RAM manifests single-bit errors: you run a memory test and
> it won't catch them; later, in production, you'll miss the error
> because nothing provides an extra sanity check of your data.

Ok...
Does the standard fs, UFS2, do "extra sanity checks", then?

M.


More information about the freebsd-stable mailing list