Re: unusual ZFS issue

From: Xin LI <delphij_at_gmail.com>
Date: Thu, 14 Dec 2023 22:35:07 UTC
On Thu, Dec 14, 2023 at 2:29 PM Lexi Winter <lexi@le-fay.org> wrote:

> On 14 Dec 2023, at 22:25, Xin LI <delphij@gmail.com> wrote:
> > Try "zpool status -x" and see if it would show something useful?
>
> the output seems to be the same as ‘zpool status -v’:
>
> # zpool status -xv
>   pool: data
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
>   scan: scrub in progress since Thu Dec 14 18:58:21 2023
>         11.5T / 18.8T scanned at 962M/s, 8.71T / 18.8T issued at 726M/s
>         0B repaired, 46.41% done, 04:02:02 to go
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         data        ONLINE       0     0     0
>           raidz2-0  ONLINE       0     0     0
>             da4p1   ONLINE       0     0     0
>             da6p1   ONLINE       0     0     0
>             da5p1   ONLINE       0     0     0
>             da7p1   ONLINE       0     0     0
>             da1p1   ONLINE       0     0     0
>             da0p1   ONLINE       0     0     0
>             da3p1   ONLINE       0     0     0
>             da2p1   ONLINE       0     0     0
>         logs
>           mirror-2  ONLINE       0     0     0
>             ada0p4  ONLINE       0     0     0
>             ada1p4  ONLINE       0     0     0
>         cache
>           ada1p5    ONLINE       0     0     0
>           ada0p5    ONLINE       0     0     0
>
> errors: Permanent errors have been detected in the following files:
>

This is strange; I'd expect to see some non-zero values above.  Did you run
'zpool clear' before this?
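
(For illustration, and assuming the pool name "data" from your output:

  # zpool clear data

resets the per-device READ/WRITE/CKSUM counters, which would explain
all-zero columns even while permanent errors are still listed.)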

Note that this is permanent damage (otherwise ZFS would automatically "heal"
the pool by overwriting the bad data with a good copy, and your applications
would never see it).  You can delete the affected files or datasets and
restore them from backup.  By the way, if you are not using ECC memory, it
can occasionally cause damage like this if you are really unlucky.
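
Roughly, the recovery could look like this (the path below is only a
placeholder, use whatever 'zpool status -v' actually lists):

  # zpool status -v data
    (lists the damaged files under "errors:")
  # rm /data/path/to/damaged-file
    (then restore the file from backup)
  # zpool scrub data
    (the entries should drop out of the error list once the bad blocks are
     gone and a scrub has completed)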

> i think this is expected since -x just filters the output to show pools
> with errors?
>

Yeah.... -x shows only pools that are not healthy.  If you have only one
pool and that's the pool you are seeing issues with, the output should be
identical.
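
For comparison, on a system where nothing is wrong you would just get:

  # zpool status -x
  all pools are healthy

while 'zpool status' / 'zpool status -v' would still print the full per-pool
details.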

Cheers,