Re: Sudden zpool checksums errors

From: mike tancsa <mike_at_sentex.net>
Date: Fri, 04 Apr 2025 16:05:37 UTC
On 4/4/2025 11:42 AM, Andrea Venturoli wrote:
> Hello.
>
> I've got a box with two zpools:
> _ 1 mirror on 2 SSDs;
> _ 1 raidz1 on 12 HDDs.
>
> Suddenly one daily run showed the following:
>>  pool: backup
>>  state: ONLINE
>> status: One or more devices has experienced an unrecoverable error.  An
>>     attempt was made to correct the error.  Applications are unaffected.
>> action: Determine if the device needs to be replaced, and clear the errors
>>     using 'zpool clear' or replace the device with 'zpool replace'.
>>    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
>>   scan: scrub repaired 3.18M in 16:53:16 with 0 errors on Tue Apr  1 20:16:55 2025
>> config:


I have seen marginal power supplies, backplane problems, and bad breakout 
cables from the controller all produce errors like that.  I would check 
the power supply first, the backplane next, and the controller third. 
Firmware bugs can cause this too, but in my experience that is relatively 
rare and usually affects SSDs, not HDDs.
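
Before swapping hardware, it may help to see whether the checksum errors sit on one disk or are spread over several. Errors on many disks at once tend to point at something shared (power supply, backplane, controller) rather than a single drive. Below is a minimal Python sketch, assuming the usual NAME/STATE/READ/WRITE/CKSUM table layout of `zpool status`. The sample text and device names (da0 and so on) are made up for illustration, since the config table is not shown above, and `cksum_errors` is a hypothetical helper name.

```python
import re

# Hypothetical sample, NOT taken from this thread (the real config
# table was cut from the quoted output). Device names are assumptions.
SAMPLE = """\
config:

  NAME        STATE     READ WRITE CKSUM
  backup      ONLINE       0     0     0
    raidz1-0  ONLINE       0     0     0
      da0     ONLINE       0     0     3
      da1     ONLINE       0     0     0
      da2     ONLINE       0     0     7
"""

# An indented row: name, state, then three counters that start with a
# digit (counters may carry a suffix such as "1.2K").
ROW = re.compile(r"^\s+(\S+)\s+(\S+)\s+(\d\S*)\s+(\d\S*)\s+(\d\S*)")


def cksum_errors(status_text):
    """Return {name: cksum_counter} for rows whose CKSUM column is not 0."""
    bad = {}
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m and m.group(5) != "0":
            bad[m.group(1)] = m.group(5)
    return bad


if __name__ == "__main__":
    for name, count in cksum_errors(SAMPLE).items():
        print(name, count)
```

On a live system the text would come from running `zpool status backup` instead of the sample string. The counters are kept as strings because larger values are printed in abbreviated form.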

     ---Mike