Re: CURRENT: Panic VERIFY(!zil_replaying(zilog, tx)) failed (and crashing)

From: Nuno Teixeira <eduardo_at_freebsd.org>
Date: Wed, 12 Apr 2023 13:16:07 UTC
(...)

Trying:

`zpool set feature@block_cloning=disabled zroot`:
cannot set property for 'zroot': property 'feature@block_cloning' can only
be set to 'disabled' at creation time
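
For reference, a minimal sketch of checking the feature's current state on the pool. It assumes the pool is named zroot as above; the `describe_state` helper is hypothetical and only interprets the three state values (disabled, enabled, active) that `zpool get` reports for feature flags.

```shell
#!/bin/sh
# Hypothetical helper: explain a feature-flag state string.
describe_state() {
    case "$1" in
        disabled) echo "disabled: feature was never enabled on this pool" ;;
        enabled)  echo "enabled: feature is on, but no on-disk data uses it yet" ;;
        active)   echo "active: on-disk data depends on the feature" ;;
        *)        echo "unknown state: $1" ;;
    esac
}

# Only query the pool when zpool is available (assumes a pool named zroot).
if command -v zpool >/dev/null 2>&1; then
    state=$(zpool get -H -o value feature@block_cloning zroot 2>/dev/null)
    if [ -n "$state" ]; then
        describe_state "$state"
    fi
fi
```

As the error message above indicates, a state of enabled or active cannot be set back to disabled on an existing pool.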

Nuno Teixeira <eduardo@freebsd.org> wrote on Wednesday, 12/04/2023 at
13:57:

> Hello all,
>
> On CURRENT at 3fdb40d1befe, after `zfs upgrade XXX`, I see the same
> problem when running the compiler:
>
> - poudriere: crash without dump
> - make buildworld (/usr/src): shutdown -p (I will try to get a photo)
>
> Is there a way to disable block cloning?
>
> Cy Schubert <Cy.Schubert@cschubert.com> wrote on Tuesday, 11/04/2023
> at 15:47:
>
>> In message <20230411142831.DB8245FA@slippy.cwsent.com>, Cy Schubert
>> writes:
>> > In message <434B83DB-F6BB-436F-8AA5-385730D20BB1@dawidek.net>,
>> > Paweł Jakub Dawidek writes:
>> > >
>> > >
>> > > > On Apr 11, 2023, at 11:31, Cy Schubert <Cy.Schubert@cschubert.com> wrote:
>> > > >
>> > > > In message <20230409161436.5412fa6e@thor.intern.walstatt.dynvpn.de>,
>> > > > FreeBSD User writes:
>> > > >> On Sun, 9 Apr 2023 14:37:03 +0200
>> > > >> Mateusz Guzik <mjguzik@gmail.com> wrote:
>> > > >>
>> > > >>>> On 4/9/23, FreeBSD User <freebsd@walstatt-de.de> wrote:
>> > > >>>>> Today, after upgrading to FreeBSD 14.0-CURRENT #8
>> > > >>>>> main-n262052-0d4038e3012b: Sun Apr  9 12:01:02 CEST 2023  amd64,
>> > > >>>>> AND upgrading ZPOOLs via
>> > > >>>>>
>> > > >>>>> zpool upgrade POOLNAME
>> > > >>>>>
>> > > >>>>> some boxes keep crashing when starting compiler runs (the trigger
>> > > >>>>> differs from box to box).
>> > > >>>>>
>> > > >>>>> The ZFS module is statically compiled into the kernel (if this is
>> > > >>>>> of importance).
>> > > >>>>>
>> > > >>>>> Last known good was:
>> > > >>>>>=20
>> > > >>>>> [...]
>> > > >>>>> Apr  9 07:10:04 <0.2> thor kernel: FreeBSD 14.0-CURRENT #7 main-n262051-75379ea2e461: Sun Apr  9 00:12:57 CEST 2023
>> > > >>>>> Apr  9 07:10:04 <0.2> thor kernel: root@thor:/usr/obj/usr/src/amd64.amd64/sys/THOR amd64
>> > > >>>>> Apr  9 07:10:04 <0.2> thor kernel: FreeBSD clang version 15.0.7 (https://github.com/llvm/llvm-project.git llvmorg-15.0.7-0-g8dfdcc7b7bf6)
>> > > >>>>> Apr  9 07:10:04 <0.2> thor kernel: VT(efifb): resolution 2560x1440
>> > > >>>>> Apr  9 07:10:04 <0.2> thor kernel: module zfsctrl already present!
>> > > >>>>> [...]
>> > > >>>>>
>> > > >>>>> The file /var/crash/info.X
>> > > >>>>>
>> > > >>>>> contains:
>> > > >>>>>
>> > > >>>>> [...]
>> > > >>>>>
>> > > >>>>> root@thor:/var/crash # more info.2
>> > > >>>>> Dump header from device: /dev/gpt/swap
>> > > >>>>>  Architecture: amd64
>> > > >>>>>  Architecture Version: 2
>> > > >>>>>  Dump Length: 1095192576
>> > > >>>>>  Blocksize: 512
>> > > >>>>>  Compression: none
>> > > >>>>>  Dumptime: 2023-04-09 11:43:41 +0000
>> > > >>>>>  Hostname: thor.local
>> > > >>>>>  Magic: FreeBSD Kernel Dump
>> > > >>>>>  Version String: FreeBSD 14.0-CURRENT #8 main-n262052-0d4038e3012b: Sun Apr  9 12:01:02 CEST 2023
>> > > >>>>>    root@thor:/usr/obj/usr/src/amd64.amd64/sys/THOR
>> > > >>>>>  Panic String: VERIFY(!zil_replaying(zilog, tx)) failed
>> > > >>>>>
>> > > >>>>>  Dump Parity: 2961465682
>> > > >>>>>  Bounds: 2
>> > > >>>>>  Dump Status: good
>> > > >>>>>
>> > > >>>>> Until the kernel is reconfigured with more debugging, I have
>> > > >>>>> nothing more to present.
>> > > >>>>>
>> > > >>>>> I remember now, really scared, that there was a HEADSUP on the
>> > > >>>>> list regarding some serious ZFS problems - I didn't find it right
>> > > >>>>> now.
>> > > >>>>>
>> > > >>>>> Thanks in advance,
>> > > >>>>>
>> > > >>>
>> > > >>> That's fallout from the new block cloning feature, adding the author
>> > > >>>
>> > > >>
>> > > >> Thanks.
>> > > >>
>> > > >> As of this moment, all systems with the newest kernel and the new
>> > > >> ZFS option enabled, crash -
>> > > >> the reason is mostly in different ZFS datasets. I guess there is no
>> > > >> way back once this faulty option is enabled?
>> > > >
>> > > > I've run a test on a scratch pool here, first without block_cloning
>> > > > enabled, then with. There was no corruption when block_cloning was
>> > > > disabled. There was corruption when block_cloning was enabled.
>> > > >
>> > > > I don't know of any way to revert, nor is there any way to fix or
>> > > > recover the corrupted blocks.
>> > >
>> > > Is the corruption still present after EXDEV fixes?
>> >
>> > Yes and no.
>> >
>> > Yes, there is corruption when block_cloning is enabled.
>> >
>> > There is no corruption when block_cloning is disabled.
>>
>> I should add some detail to this.
>>
>> The corruption experienced when block cloning is disabled was fixed by:
>>
>> - eb1feadc201a
>> - e2d997d1cbb9
>> - d012836fb616 (specifically this commit)
>> - 20be1b4fc4b7
>>
>> When block_cloning is enabled, the pool is corrupted. This has not been
>> fixed.
>>
>>
>> --
>> Cheers,
>> Cy Schubert <Cy.Schubert@cschubert.com>
>> FreeBSD UNIX:  <cy@FreeBSD.org>   Web:  https://FreeBSD.org
>> NTP:           <cy@nwtime.org>    Web:  https://nwtime.org
>>
>>                         e^(i*pi)+1=0
>>
>
> --
> Nuno Teixeira
> FreeBSD Committer (ports)
>


-- 
Nuno Teixeira
FreeBSD Committer (ports)