ECC status in FreeBSD

Brett Glass brett at lariat.org
Mon Dec 20 14:43:24 PST 2004


At 03:25 PM 12/20/2004, Charles Swiger wrote:

>However, your RAM isn't a hard drive, so the bad-sector remapping used
>by hard drives is not fully applicable.  Your machine is expected not  
>to have any part of memory fail reproducibly, but if you do, it's time
>to use the warranty and replace the entire chip.

It's true that RAM is not a hard drive. However, if the problem is with
certain memory cells rather than, say, the row or column drivers, the
rest of the chip is usable. And if you did want to scuttle the entire
module on which the chip resided, you'd probably want to disable that
module in the meantime by telling the system not to use it. Certainly,
you'd at least want to know which module was failing. There's nothing
to tell you that right now.

>ECC is a fine idea, but the motherboard chipset pretty much does  
>everything that is required (except for the reporting/syslogging), so  
>the kernel doesn't need to be specially involved for the system to  
>benefit from ECC protection.

Alas, right now there's no way to KNOW that you need to deal with a 
failing RAM module until you start experiencing random and possibly
destructive system panics or crashes. It'd be nice, at least, to see 
something in the logs or be able to collect statistics from the 
motherboard.
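
To make that concrete, here is a rough sketch of the kind of bookkeeping
I have in mind: map each reported error address to a module and count
the reports per module. Everything in it is an assumption for
illustration only. The module labels, the address ranges, the error
addresses, and the helper names (module_for, tally, suspects) are all
made up, and a real board may interleave modules so that a flat range
map like this would not apply.

```python
from collections import Counter

# Hypothetical layout: (label, first physical address, last physical address).
# Real boards may interleave modules, so a flat range map is a simplification.
MODULES = [
    ("DIMM0", 0x00000000, 0x1FFFFFFF),
    ("DIMM1", 0x20000000, 0x3FFFFFFF),
]

def module_for(addr):
    """Return the label of the module whose range contains addr, or None."""
    for label, first, last in MODULES:
        if first <= addr <= last:
            return label
    return None

def tally(error_addrs):
    """Count reported errors per module label."""
    return Counter(module_for(a) for a in error_addrs)

def suspects(counts, threshold=2):
    """List modules with at least `threshold` reported errors."""
    return sorted(m for m, n in counts.items()
                  if m is not None and n >= threshold)

# Made-up error addresses, for illustration only.
errors = [0x21000040, 0x21000040, 0x00001000, 0x3FFF0000]
counts = tally(errors)
print(counts)
print(suspects(counts))
```

With even this much, you'd know which module to pull, or to tell the
system to avoid, before the panics start.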

--Brett



More information about the freebsd-questions mailing list