Strange ZFS filesystem corruption

Paul Mather paul at gromit.dlib.vt.edu
Tue Oct 4 14:31:53 UTC 2011


On Oct 3, 2011, at 6:19 PM, Artem Belevich wrote:

> On Mon, Oct 3, 2011 at 11:21 AM, Paul Mather <paul at gromit.dlib.vt.edu> wrote:
>> =====
>> 
>> The pool itself reports no errors.  I performed a scrub on the pool yet this bizarre filesystem corruption persists:
>> 
>> =====
>> tape# zpool status backups
>>  pool: backups
>>  state: ONLINE
>>  scan: scrub repaired 15K in 7h33m with 0 errors on Sat Oct  1 19:22:35 2011
> 
> The pool *did* report 15K errors that it was able to repair.
> 
> I'd start with testing your RAM with memtest86 or memtest86+. ZFS
> errors without reported checksum errors may be the sign of bad memory.
> I.e. data gets corrupted before ZFS gets to calculate checksum and
> later invalid data with valid checksum gets written to disk.
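
The failure mode described in the quoted paragraph can be illustrated with a small sketch. This is not ZFS code: the `write_block` and `verify_block` helpers are hypothetical, and CRC32 stands in for the checksum purely for illustration (ZFS uses other algorithms such as fletcher4 or SHA-256). The point is only that a checksum computed after the data has already been damaged in memory will still verify on read.

```python
import zlib


def write_block(data: bytes, corrupt_in_ram: bool = False):
    """Hypothetical write path: returns (stored_data, stored_checksum)."""
    buf = bytearray(data)
    if corrupt_in_ram:
        # Simulate a single bit flip in memory before the checksum is computed.
        buf[0] ^= 0x01
    # The checksum covers whatever is in the buffer, damaged or not.
    checksum = zlib.crc32(bytes(buf))
    return bytes(buf), checksum


def verify_block(stored: bytes, checksum: int) -> bool:
    """Hypothetical read path / scrub check: recompute and compare."""
    return zlib.crc32(stored) == checksum


original = b"important backup data"
stored, checksum = write_block(original, corrupt_in_ram=True)

checksum_ok = verify_block(stored, checksum)   # the scrub sees nothing wrong
data_intact = stored == original               # but the contents differ
print(checksum_ok, data_intact)
```

In this sketch the stored block passes verification even though it no longer matches what the application intended to write, which is why a scrub alone cannot rule out corruption that happened before checksumming.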


Because this machine has ECC RAM, I checked the BIOS logs for ECC errors (the BIOS is set to log them) and there are no ECC errors logged.  If the RAM were going bad, I would expect it to leave some kind of trace in the BIOS log.

Do uncorrectable ECC errors get logged as MCEs under FreeBSD 9?

I've never noticed any problems when doing a "make -j8 buildworld" on this machine, either.

Cheers,

Paul.

