ZFS RAID 0+1 Throwing Checksum Errors

Sean Chittenden sean at chittenden.org
Mon Nov 9 19:54:24 UTC 2015


Tim, I've run into this a dozen or so times on servers with "dirty" power (e.g. small servers running ZFS in homes or small offices).  If you plug the box into a UPS to condition the line, you may find that the checksum errors go away.  It's pretty amazing to see, and it happens with both SSDs and spinning rust.  Dirty power isn't always the cause, but it's a common enough environmental problem.  Report back if you try this and let us know whether it solves your problem.
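
To make the before/after comparison easier, here is a minimal sketch (Python, not part of the original thread) that pulls the per-device CKSUM counters out of text in the layout that `zpool status` prints.  The function name `cksum_counts`, the sample text, and the counter values in it are assumptions made up for illustration.  The sketch also assumes the counters are plain integers, so abbreviated values on a badly affected pool would be skipped.

```python
import re

# Sample text in the layout printed by `zpool status`.  The counter values
# are invented for illustration; they are not taken from the thread.
SAMPLE = """\
  pool: tank
 state: ONLINE
config:

        NAME            STATE     READ WRITE CKSUM
        tank            ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            gpt/zfs0    ONLINE       0     0     3
            gpt/zfs1    ONLINE       0     0     7
          mirror-1      ONLINE       0     0     0
            gpt/zfs2    ONLINE       0     0     1
            gpt/zfs3    ONLINE       0     0     0

errors: No known data errors
"""

# One config row: name, upper-case state, then READ / WRITE / CKSUM integers.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)\b")


def cksum_counts(status_text):
    """Return {name: cksum_count} for every row of the config table."""
    counts = {}
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m:
            counts[m.group(1)] = int(m.group(5))
    return counts


if __name__ == "__main__":
    # Print only the devices that currently show checksum errors.
    for name, n in cksum_counts(SAMPLE).items():
        if n:
            print(f"{name}: {n} checksum errors")
```

In practice the text would come from running `zpool status tank` after each weekly scrub and before clearing the counters; saving the resulting dictionary each week gives a simple record to compare once the UPS is in place.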

-sc



--
Sean Chittenden
sean at chittenden.org

> On Nov 9, 2015, at 11:08, Tim Gustafson <tjg at ucsc.edu> wrote:
> 
> I have a FreeBSD 10.1 server configured as root-on-zfs with the
> following pool configuration:
> 
>         NAME            STATE     READ WRITE CKSUM
>         tank            ONLINE       0     0     0
>           mirror-0      ONLINE       0     0     0
>             gpt/zfs0    ONLINE       0     0     0
>             gpt/zfs1    ONLINE       0     0     0
>           mirror-1      ONLINE       0     0     0
>             gpt/zfs2    ONLINE       0     0     0
>             gpt/zfs3    ONLINE       0     0     0
> 
> The disks are each 1TB Samsung 850 EVO SSDs connected via an mrsas Dell
> PERC RAID controller configured in "RAID Disabled" mode.
> 
> I run a "zpool scrub" every weekend, and every weekend the scrub finds
> a handful (usually between 1 and 10) of checksum errors per disk.  The
> scrub fixes the checksum errors, I clear the counters, and
> everything seems fine.  As far as I know, I do not have any corrupt or
> missing data.
> 
> The server is a fairly busy web and database server, handling about 5
> million hits per day.
> 
> I'm wondering if the problem is that the scrub is calculating the
> checksum for the data on gpt/zfs0, and while that's happening, some
> data is updated by Apache or MySQL, and then the checksum for the data
> on gpt/zfs1 is calculated, which now doesn't match, and therefore the
> scrub is reporting an error.  Is that possible?
> 
> If that's not it, could this be a bug?  Or should I be worried about
> my SSDs?  What additional data would be helpful for me to share to
> diagnose this?
> 
> -- 
> 
> Tim Gustafson
> Technical Lead, Baskin School of Engineering
> tjg at ucsc.edu
> 831-459-5354
> Baskin Engineering, Room 313A


