ZVOL with volblocksize=64k+ results in data corruption [was Re: Resolving errors with ZVOLs]

Wiktor Niesiobedzki bsd at vink.pl
Thu Sep 21 20:00:11 UTC 2017


Hi,

I've conducted additional tests. It looks like when I create volumes with
the following commands:
# zfs create -V50g -o volmode=dev -o volblocksize=64k -o compression=off -o
com.sun:auto-snapshot=false tank/test
# zfs create -V50g -o volmode=dev -o volblocksize=128k -o compression=off
-o com.sun:auto-snapshot=false tank/test

I'm able to get checksum errors quite reliably within 2-12 hours of normal
use of the volume. I also tested other volblocksizes:
# zfs create -V50g -o volmode=dev -o volblocksize=32k -o compression=off -o
com.sun:auto-snapshot=false tank/test
# zfs create -V50g -o volmode=dev -o volblocksize=8k -o compression=off -o
com.sun:auto-snapshot=false tank/test
# zfs create -V50g -o volmode=dev -o volblocksize=4k -o compression=off -o
com.sun:auto-snapshot=false tank/test

These ran for more than 24 hours with no apparent errors (I also moved
other volumes to 4k, and they have not shown any checksum errors for more
than two weeks).
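
In my case the errors appeared under normal use of the volumes from bhyve;
a minimal stand-in for that workload (the device path follows from
volmode=dev and the dataset name above) would be something like:

# dd if=/dev/random of=/dev/zvol/tank/test bs=1m count=10240
# zpool scrub tank
# zpool status -v tank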

I had been running with volblocksize=128k since January this year. The
problem only started to appear after I updated from 11.0 to 11.1.

Should I file a bug report for this? What additional information should I
gather?
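
My first guess at what would be useful to attach to the report:

$ uname -a
# zpool status -v tank
# zpool get all tank
# zfs get all tank/test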


Cheers,

Wiktor Niesiobędzki

PS. I found a way to clear the errors reported in zpool status. It turns
out that they disappear after a scrub, but only if the scrub is run
substantially later than the zfs destroy. Maybe some references in the ZIL
prevent these errors from being removed? Is this a bug?
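
For the record, the sequence that eventually cleared them looked roughly
like this (the delay before the scrub seems to be what matters):

# zfs destroy tank/dkr-test
(wait; the errors only cleared when the scrub ran substantially later)
# zpool scrub tank
# zpool status -v tank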

2017-09-04 19:12 GMT+02:00 Wiktor Niesiobedzki <bsd at vink.pl>:

> Hi,
>
> I can follow up on my issue - the same problem has just happened on a
> second ZVOL that I created:
> # zpool status -v
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://illumos.org/msg/ZFS-8000-8A
>   scan: scrub repaired 0 in 5h27m with 0 errors on Sat Sep  2 15:30:59 2017
> config:
>
>         NAME               STATE     READ WRITE CKSUM
>         tank               ONLINE       0     0    14
>           mirror-0         ONLINE       0     0    28
>             gpt/tank1.eli  ONLINE       0     0    28
>             gpt/tank2.eli  ONLINE       0     0    28
>
> errors: Permanent errors have been detected in the following files:
>
>         tank/docker-big:<0x1>
>         <0x5095>:<0x1>
>
>
> I suspect that these errors might be related to my recent upgrade to 11.1.
> Until 19 August I was running 11.0. I am considering rolling back to 11.0
> right now.
>
> For reference:
> # zfs get all tank/docker-big
> NAME             PROPERTY               VALUE                  SOURCE
> tank/docker-big  type                   volume                 -
> tank/docker-big  creation               Sat Sep  2 10:09 2017  -
> tank/docker-big  used                   100G                   -
> tank/docker-big  available              747G                   -
> tank/docker-big  referenced             10.5G                  -
> tank/docker-big  compressratio          4.58x                  -
> tank/docker-big  reservation            none                   default
> tank/docker-big  volsize                100G                   local
> tank/docker-big  volblocksize           128K                   -
> tank/docker-big  checksum               skein                  inherited from tank
> tank/docker-big  compression            lz4                    inherited from tank
> tank/docker-big  readonly               off                    default
> tank/docker-big  copies                 1                      default
> tank/docker-big  refreservation         100G                   local
> tank/docker-big  primarycache           all                    default
> tank/docker-big  secondarycache         all                    default
> tank/docker-big  usedbysnapshots        0                      -
> tank/docker-big  usedbydataset          10.5G                  -
> tank/docker-big  usedbychildren         0                      -
> tank/docker-big  usedbyrefreservation   89.7G                  -
> tank/docker-big  logbias                latency                default
> tank/docker-big  dedup                  off                    default
> tank/docker-big  mlslabel                                      -
> tank/docker-big  sync                   standard               default
> tank/docker-big  refcompressratio       4.58x                  -
> tank/docker-big  written                10.5G                  -
> tank/docker-big  logicalused            47.8G                  -
> tank/docker-big  logicalreferenced      47.8G                  -
> tank/docker-big  volmode                dev                    local
> tank/docker-big  snapshot_limit         none                   default
> tank/docker-big  snapshot_count         none                   default
> tank/docker-big  redundant_metadata     all                    default
> tank/docker-big  com.sun:auto-snapshot  false                  local
>
> Any ideas what I should try before rolling back?
>
>
> Cheers,
>
> Wiktor
>
> 2017-09-02 19:17 GMT+02:00 Wiktor Niesiobedzki <bsd at vink.pl>:
>
>> Hi,
>>
>> I have recently encountered errors on my ZFS pool on 11.1-RELEASE:
>> $ uname -a
>> FreeBSD kadlubek 11.1-RELEASE-p1 FreeBSD 11.1-RELEASE-p1 #0: Wed Aug  9 11:55:48 UTC 2017     root at amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
>>
>> # zpool status -v tank
>>   pool: tank
>>  state: ONLINE
>> status: One or more devices has experienced an error resulting in data
>>         corruption.  Applications may be affected.
>> action: Restore the file in question if possible.  Otherwise restore the
>>         entire pool from backup.
>>    see: http://illumos.org/msg/ZFS-8000-8A
>>   scan: scrub repaired 0 in 5h27m with 0 errors on Sat Sep  2 15:30:59 2017
>> config:
>>
>>         NAME               STATE     READ WRITE CKSUM
>>         tank               ONLINE       0     0    98
>>           mirror-0         ONLINE       0     0   196
>>             gpt/tank1.eli  ONLINE       0     0   196
>>             gpt/tank2.eli  ONLINE       0     0   196
>>
>> errors: Permanent errors have been detected in the following files:
>>
>>         dkr-test:<0x1>
>>
>> dkr-test is a ZVOL that I use within bhyve, and indeed I have noticed I/O
>> errors on this volume within bhyve. This ZVOL did not have any snapshots.
>>
>> Following the advice given in the action field, I tried to remove the
>> affected ZVOL:
>> # zfs destroy tank/dkr-test
>>
>> But the errors are still reported in zpool status:
>> errors: Permanent errors have been detected in the following files:
>>
>>         <0x5095>:<0x1>
>>
>> I can't find any reference to this dataset in zdb (0x5095 is 20629 in
>> decimal):
>>  # zdb -d tank | grep 5095
>>  # zdb -d tank | grep 20629
>>
>>
>> I also tried getting statistics about the metadata in this pool:
>> # zdb -b tank
>>
>> Traversing all blocks to verify nothing leaked ...
>>
>> loading space map for vdev 0 of 1, metaslab 159 of 174 ...
>>         No leaks (block sum matches space maps exactly)
>>
>>         bp count:        24426601
>>         ganged count:           0
>>         bp logical:    1983127334912      avg:  81187
>>         bp physical:   1817897247232      avg:  74422     compression: 1.09
>>         bp allocated:  1820446928896      avg:  74527     compression: 1.09
>>         bp deduped:             0    ref>1:      0   deduplication:   1.00
>>         SPA allocated: 1820446928896     used: 60.90%
>>
>>         additional, non-pointer bps of type 0:      57981
>>         Dittoed blocks on same vdev: 296490
>>
>> At this point zdb got stuck, using 100% CPU.
>>
>> And now to my questions:
>> 1. Do I interpret this correctly, that the problem is probably due to an
>> error during a write, and both copies of the block ended up with checksums
>> that mismatch their data? And if it is a hardware problem, it is probably
>> something other than the disks? (No, I don't use ECC RAM.)
>>
>> 2. Is there any way to remove the offending dataset and clean the pool of
>> these errors?
>>
>> 3. Is my metadata OK? Or should I restore the entire pool from backup?
>>
>> 4. I also tried running zdb -bc tank, but this resulted in a kernel panic.
>> I might try to get the stack trace once I get physical access to the
>> machine next week. Also, checksum verification slows the process down from
>> 1000MB/s to less than 1MB/s. Is this expected?
>>
>> 5. When I work with zdb (as above), should I try to limit writes to the
>> pool (e.g. by unmounting the datasets)?
>>
>> Cheers,
>>
>> Wiktor Niesiobedzki
>>
>> PS. Sorry for the previous partial message.
>>
>>
>

