ZFS RAID 0+1 Throwing Checksum Errors

Tim Gustafson tjg at ucsc.edu
Mon Nov 9 20:07:44 UTC 2015


> It could be the SSDs, the controller, cables, or power supply.
> The problem might occur when the data is written, or when it is
> read back. If this is occurring on all of the SSDs then look for
> some shared component which might be causing the problem.

I'll check into these possibilities.  Now that I think of it, I should
check to see if the PERC card has its own log that I could look at,
even though we're not using its RAID features.

> Tim, I've run into this a dozen or so times on servers where their
> power is "dirty" (i.e. home or small offices with small servers that
> use ZFS).  If you plug the box into a UPS to condition the line you
> may find that the checksum errors go away.  It's pretty amazing
> to see and happens with both SSD and spinning rust.  It's not
> always the case, but it's a common enough environmental problem.
> Report back if you try this and it solves your problem.

I'm pretty sure that's not the case.  This server is a new-ish (3
months old?) Dell R630 with redundant power supplies, connected to an
industrial APC UPS, connected to an industrial diesel backup
generator, housed in a real bona fide server room with industrial
cooling.  There are about 80 other physical servers in this room,
and none of the others are exhibiting this behavior.

-- 

Tim Gustafson
Technical Lead, Baskin School of Engineering
tjg at ucsc.edu
831-459-5354
Baskin Engineering, Room 313A

