Opteron ECC

James Van Artsdalen james at jrv.org
Sun Feb 22 22:33:01 PST 2004


> From: Peter Wemm <peter at wemm.org>
> Date: Sun, 22 Feb 2004 21:47:13 -0800
> 
> On Sunday 22 February 2004 09:01 pm, James Van Artsdalen wrote:
> > It turns out that AMD has published its Opteron errata sheet and
> > errata item 101 appears to be the issue: a bug in the Opteron means
> > you can't have both "node interleave" and ECC scrubbing on at the
> > same time.
> 
> Oh my, that's a bit of a stinker.  Do you recall which steppings this 
> applies to?
> 
> BTW, I suspect you might find that node interleave is more useful
> (speed-wise) than ECC background scrubbing.  But I guess that depends
> on what you want.  If you're trying to wring every bit of performance
> out of it, pick node interleave over scrubbing.  On the other hand, if
> you'd prefer to have the system constantly checking that the ECC RAM
> is OK and you're not so worried about speed, then pick scrubbing.

The errata list is here:
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.pdf

Bug 101 affects both B3 and C0 steppings.

A word for people reading processor errata for the first time.  Here's
a comment I made on another list:

   This errata list may look gruesome to those not used to such
   things, but it's not bad at all. I dealt with processor errata
   lists from Intel for years as a PC designer - the double-secret-NDA
   lists - and this is par for the course, perhaps even cleaner than
   usual.

It might not hurt to add a check to the kernel for these steppings,
node interleave and scrubbing, and print a warning if all three
conditions are met.
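
A minimal sketch of such a check, written as user-space C rather than
real kernel code.  The enum, the function names and the message text
are all made up for illustration, and how the stepping and the two
settings would actually be read from the hardware is deliberately left
out:

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical revision tags.  A real kernel would derive the
 * revision from CPUID; that decoding is not shown here.
 */
enum opteron_rev { OPTERON_REV_B3, OPTERON_REV_C0, OPTERON_REV_OTHER };

/* True only when all three conditions for erratum 101 hold. */
static bool
erratum101_applies(enum opteron_rev rev, bool node_interleave, bool ecc_scrub)
{
	bool affected = (rev == OPTERON_REV_B3 || rev == OPTERON_REV_C0);

	return (affected && node_interleave && ecc_scrub);
}

/* Print a warning and return true if the risky combination is seen. */
static bool
warn_erratum101(FILE *out, enum opteron_rev rev, bool node_interleave,
    bool ecc_scrub)
{
	if (!erratum101_applies(rev, node_interleave, ecc_scrub))
		return (false);
	fprintf(out, "WARNING: Opteron erratum 101: node interleave and "
	    "ECC scrubbing are both enabled on an affected stepping\n");
	return (true);
}
```

Keeping the decision in one small predicate means the code that reads
the hardware state, whatever it turns out to look like, only has to
supply three values.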

There is little difference between a stored 1 and 0 in a modern DRAM.
I'm not sure what the rate of error accumulation is.  If you have high
density DRAM and might not touch some of it for long periods of time,
scrubbing might prevent two errors from accumulating in one line,
which ordinary single-bit-correcting ECC could not repair.
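
To make the scrubbing argument concrete, here is a rough sketch that
assumes bit upsets arrive independently at a constant rate (a Poisson
model, which is my assumption and not something from the errata
sheet).  The upset rate is a parameter precisely because I don't know
the real figure, and the function names are made up.  The estimate is
only the leading term and is valid only when the expected number of
upsets per line is much smaller than 1:

```c
/*
 * Expected number of bit upsets in one ECC line between two visits
 * (reads or scrubs).  All three inputs are caller-supplied guesses.
 */
static double
expected_upsets(double upsets_per_bit_hour, unsigned bits_per_line,
    double hours_between_visits)
{
	return (upsets_per_bit_hour * bits_per_line * hours_between_visits);
}

/*
 * Leading-order Poisson estimate of the chance that two or more
 * upsets land in the same line: P(>=2) ~= mu^2 / 2 for mu << 1.
 */
static double
p_double_upset(double mu)
{
	return (0.5 * mu * mu);
}
```

Under this model the chance of a double error in a line grows with the
square of the time between visits, so scrubbing twice as often cuts it
to roughly a quarter.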

