Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75

From: Mateusz Guzik <mjguzik_at_gmail.com>
Date: Wed, 12 Apr 2023 16:46:50 UTC
On 4/12/23, FreeBSD User <freebsd@walstatt-de.de> wrote:
> On Wed, 12 Apr 2023 11:51:09 -0400
> Charlie Li <vishwin@freebsd.org> wrote:
>
>> Cy Schubert wrote:
>> > I have a "sandbox" pool, called t, used for /usr/obj, ports wrkdirs,
>> > and other writes I can easily recreate on my laptop. Here are the
>> > results of my tests.
>> >
>> > Method:
>> >
>> > Initially I copied my /usr/obj from my two build machines (one
>> > amd64.amd64 and an
>> > i386.i386) to my "sandbox" zpool.
>> >
>> > Next, with block_cloning disabled, I did a cp -R of the /usr/obj test
>> > files, then a diff -qr.
>> > The source and target directories were the same.
>> >
>> > Next, I cleaned up (rm -rf) the target directory to prepare for the
>> > block_clone enabled test.
>> >
>> > Next, I did zpool checkpoint t. After this, zpool upgrade t. Pool t now
>> > has block_cloning
>> > enabled.
>> >
>> > I repeated the cp -R test from above followed by a diff -qr. Almost
>> > every file was different. The pool was corrupted.
>> >
>> > I restored the pool with the following, which removed the corruption:
>> >
>> >
>> > slippy# zpool export t
>> > slippy# zpool import --rewind-to-checkpoint t
>> > slippy#
>> >
>> > It is recommended that people avoid upgrading their zpools until the
>> > problem is fixed.
>> >
>> As of af7624ed3145, I just did this with an md(4)-backed test pool,
>> though with the second `cp -R` landing in a separate dataset, created
>> and destroyed for each test. No corruption either way. However, my
>> poudriere builds still output/package corrupted files (particularly
>> those with null characters), probably after install(1) invocations (not
>> cp(1)).
>>
>
> I still have corrupt files on the /usr/ports tree (located on ZFS, with
> feature@block_cloning  active):
>
> [...]
> Installing man pages and online manual
> mkdir /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24
> cd /usr/ports/www/apache24/work/httpd-2.4.57/docs/manual && cp -rp * /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24
> install  -m 0644 /usr/ports/www/apache24/files/no-accf.conf /usr/ports/www/apache24/work/stage/usr/local/etc/apache24/Includes/
> install -m 0644 /usr/ports/www/apache24/files/README_modules.d /usr/ports/www/apache24/work/stage/usr/local/etc/apache24/modules.d/
> /usr/bin/strip /usr/ports/www/apache24/work/stage/usr/local/libexec/apache24/mod_*.so
> /bin/rm -f /usr/ports/www/apache24/work/stage/usr/local/share/apache24/build/ecp.???????? 2>/dev/null
> install  -m 555 /usr/ports/www/apache24/work/httpd-2.4.57/support/check_forensic /usr/ports/www/apache24/work/stage/usr/local/sbin
> ====> Compressing man pages (compress-man)
> ===> Staging rc.d startup script(s)
> ===>  Installing for apache24-2.4.57
> ===>   Registering installation for apache24-2.4.57
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
>
> www/apache24 now ALWAYS triggers this corruption, even after scrubbing the
> pool.
>
> This one is the same in my case:
>
> [...]
>
> cd /usr/ports/devel/ruby-gems/work/stage/usr/local/ && /usr/bin/find -ds lib/ruby/gems/3.1/doc/ ! -type d >> /usr/ports/devel/ruby-gems/work/.PLIST.mktmp
> ====> Compressing man pages (compress-man)
> ===>>> Starting check for runtime dependencies
> ===>>> Gathering dependency list for devel/ruby-gems from ports
> ===>>> Dependency check complete for devel/ruby-gems
>
> ===>>> All >> rubygem-addressable-2.8.1 >> devel/ruby-gems (3/27)
>
> ===>  Installing for ruby31-gems-3.4.10
> ===>   Registering installation for ruby31-gems-3.4.10 as automatic
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> *** Error code 1
>
> Stop.
> make[1]: stopped in /usr/ports/devel/ruby-gems
>
>
> Pool is then marked corrupt (was scrubbed after the last corruption):
>
> [...]
>
>   pool: POOL00
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
>   scan: scrub in progress since Wed Apr 12 18:07:02 2023
>         1.45T scanned at 2.01G/s, 139G issued at 193M/s, 13.2T total
>         0B repaired, 1.02% done, 19:49:53 to go
> config:
>
>         NAME              STATE     READ WRITE CKSUM
>         POOL00            ONLINE       0     0     0
>           raidz1-0        ONLINE       0     0     0
>             gpt/pool00    ONLINE       0     0     0
>             gpt/pool01    ONLINE       0     0     0
>             gpt/pool02    ONLINE       0     0     0
>             gpt/pool03    ONLINE       0     0     0
>
> errors: 22 data errors, use '-v' for a list
>
> [...]
>
> errors: Permanent errors have been detected in the following files:
>
>
> /usr/ports/devel/ruby-gems/work/stage/usr/local/lib/ruby/site_ruby/3.1/rubygems/optparse/lib/optionparser.rb
>
> /usr/ports/devel/ruby-gems/work/stage/usr/local/lib/ruby/site_ruby/3.1/rubygems/optparse.rb
>
> /usr/ports/www/apache24/work/stage/usr/local/www/apache24/icons/small/blank.gif
>
> /usr/ports/devel/ruby-gems/work/stage/usr/local/lib/ruby/site_ruby/3.1/rubygems/resolver/molinillo.rb
>
> /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24/images/left.gif
>
> /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24/images/right.gif
>
> /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24/images/down.gif
>
> /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24/images/pixel.gif
>
> /usr/ports/devel/ruby-gems/work/stage/usr/local/lib/ruby/site_ruby/3.1/rubygems/tsort.rb
>
> /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24/images/up.gif
>
>
> --
> O. Hartmann
>

https://github.com/openzfs/zfs/pull/14739/files

Here is a fix you can apply on top of sys/contrib/openzfs. I have no
idea how much it will help, though.
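
For anyone unsure how to apply it, a rough sketch follows. The
assumptions are mine and not from the PR: the source tree is a git
checkout in /usr/src, and GitHub serves the combined diff of a pull
request at the PR URL with ".diff" appended. The function names are
made up for illustration.

```shell
#!/bin/sh
# Sketch only. SRC and the ".diff" URL form are assumptions.
SRC="${SRC:-/usr/src}"
PR_URL="https://github.com/openzfs/zfs/pull/14739"

# Build the diff URL for a pull request URL (hypothetical helper).
pr_diff_url() {
    printf '%s.diff\n' "$1"
}

# Download the diff and apply it under sys/contrib/openzfs (hypothetical helper).
apply_pr() {
    diff_file="$(mktemp /tmp/zfs-pr.XXXXXX)" || return 1
    fetch -o "$diff_file" "$(pr_diff_url "$PR_URL")" || return 1
    # Dry run first, so a diff that does not apply leaves the tree untouched.
    git -C "$SRC" apply --check --directory=sys/contrib/openzfs "$diff_file" &&
        git -C "$SRC" apply --directory=sys/contrib/openzfs "$diff_file"
}
```

Source the file and run apply_pr, then rebuild and install the kernel
(zfs.ko included) as usual. Whether the diff applies cleanly depends on
how far your tree has moved from the one the PR was written against.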

-- 
Mateusz Guzik <mjguzik gmail.com>