Sudden zpool checksum errors

From: Andrea Venturoli <ml_at_netfence.it>
Date: Fri, 04 Apr 2025 15:42:26 UTC
Hello.

I've got a box with two zpools:
_ 1 mirror on 2 SSDs;
_ 1 raidz1 on 12 HDDs.

Suddenly one daily run showed the following:
>  pool: backup
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error.  An
> 	attempt was made to correct the error.  Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
> 	using 'zpool clear' or replace the device with 'zpool replace'.
>    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
>   scan: scrub repaired 3.18M in 16:53:16 with 0 errors on Tue Apr  1 20:16:55 2025
> config:
> 
> 	NAME        STATE     READ WRITE CKSUM
> 	backup      ONLINE       0     0     0
> 	  raidz1-0  ONLINE       0     0     0
> 	    da4     ONLINE       0     0     0
> 	    da10    ONLINE       0     0     0
> 	    da5     ONLINE       0     0    57
> 	    da2     ONLINE       0     0     0
> 	    da8     ONLINE       0     0    25
> 	    da0     ONLINE       0     0     0
> 	    da1     ONLINE       0     0    49
> 	    da12    ONLINE       0     0     8
> 	    da6     ONLINE       0     0     6
> 	    da11    ONLINE       0     0     0
> 	    da9     ONLINE       0     0    56
> 	    da13    ONLINE       0     0    73
> 
> errors: No known data errors

I'm finding it hard to believe that 7 disks out of 12 are failing or 
just happened to misbehave all on the same day.
BTW, SMART says they are OK.
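
For reference, here is a minimal Python sketch of how the affected disks can be counted from the status output above. The device lines are copied from that output; the function name cksum_errors and the regular expression are only illustrative, and the sketch assumes the counters are printed as plain integers (it does not handle abbreviated values).

```python
import re

# Leaf-device lines copied from the "zpool status" output quoted above.
STATUS = """\
    da4     ONLINE       0     0     0
    da10    ONLINE       0     0     0
    da5     ONLINE       0     0    57
    da2     ONLINE       0     0     0
    da8     ONLINE       0     0    25
    da0     ONLINE       0     0     0
    da1     ONLINE       0     0    49
    da12    ONLINE       0     0     8
    da6     ONLINE       0     0     6
    da11    ONLINE       0     0     0
    da9     ONLINE       0     0    56
    da13    ONLINE       0     0    73
"""

# Columns: NAME STATE READ WRITE CKSUM (assumes plain integer counters).
DEVICE_LINE = re.compile(r"\s*(da\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def cksum_errors(status_text):
    """Return {device: CKSUM count} for devices with a non-zero CKSUM count."""
    errors = {}
    for line in status_text.splitlines():
        match = DEVICE_LINE.match(line)
        if match and int(match.group(5)) > 0:
            errors[match.group(1)] = int(match.group(5))
    return errors


bad = cksum_errors(STATUS)
print(f"{len(bad)} disks with checksum errors: {bad}")
```

The same function could be fed a later copy of the status output to see whether the counters stay at zero after a "zpool clear".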

I'm reluctant to blame the RAM (since it's ECC) or the power supply (as 
it's redundant, 2x800W).
The disks are 16TB TOSHIBA MG09ACA1, connected to a MegaRAID SAS-3 3108 
(of course not operating as RAID, and using the mrsas driver).

% freebsd-version
14.2-RELEASE-p2
% zfs --version
zfs-2.2.6-FreeBSD_g33174af15
zfs-kmod-2.2.6-FreeBSD_g33174af15


Is there a known ZFS bug that could explain this?

I've "zpool clear"ed the errors and am waiting to see if they come up again.

  bye & Thanks
	av.

P.S.
Also, I'm quite sure no "administrator accidentally wrote over a portion 
of the disk using another program" :)