[Bug 266014] panic: corrupted zfs dataset (zfs issue)
Date: Tue, 25 Oct 2022 05:40:21 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266014
Duncan <dpy@pobox.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|panic: on long running find |panic: corrupted zfs
                   |(zfs issue)                 |dataset (zfs issue)
--- Comment #5 from Duncan <dpy@pobox.com> ---
I got back to trying to move forward with this issue (re-enabling full EOD
runs) and found out where the problem was.
In my nextcloud jail, part of the /usr/src file system would cause a panic
when accessed (e.g. by running a find over it). I haven't gotten around to
locating the exact directory/file.
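
One way to narrow down the exact directory/file could look like the sketch
below. It is not something that was run for this report. It assumes that a
plain metadata access (lstat or a directory listing, roughly what find does)
is enough to trigger the panic, and that the log file lives on a different,
healthy filesystem. The name probe_tree and the log path are hypothetical.

```python
import os
import stat


def probe_tree(root, log_path):
    """Touch every path under root, logging each path BEFORE accessing it.

    Every log entry is flushed and fsync'ed before the access happens, so
    after a panic the last line of the log names the path that was being
    accessed. Returns the number of paths visited.
    """
    visited = 0
    with open(log_path, "a", encoding="utf-8", errors="surrogateescape") as log:
        stack = [root]
        while stack:
            path = stack.pop()
            # Record the path first and force it to stable storage.
            log.write(path + "\n")
            log.flush()
            os.fsync(log.fileno())
            visited += 1
            try:
                st = os.lstat(path)  # metadata access only, no file contents
                if stat.S_ISDIR(st.st_mode):
                    for name in os.listdir(path):
                        stack.append(os.path.join(path, name))
            except OSError:
                # Unreadable entries are skipped; they are already in the log.
                continue
    return visited

# Hypothetical usage: probe_tree("/usr/src", "/var/tmp/probe.log")
```

After the reboot, the last line of the log is the candidate. The sketch does
not read file contents, so if only reading data blocks triggers the panic it
would need to be extended to open and read each regular file.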
Now the interesting thing is that this dataset is encrypted and would mount
when decrypted (using a key from higher up the filesystem hierarchy, with the
password typed in as part of startup). The panic would only occur on access
to parts of the dataset.
I tried replicating the dataset (to keep for later diagnosis), but upon
mounting the copy, the machine would panic. This required booting into single
user mode and deleting the copied dataset (I probably should have just
modified "canmount") before booting would complete without a panic.
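
For reference, keeping a suspect copy around without letting it mount could
be sketched as follows. This is an untested sketch, not what was done here:
the helper names and the dataset name backup/nextcloud-copy are hypothetical,
and the functions only build the command lines rather than run them. It
relies on "zfs receive -u" leaving the received file system unmounted and on
canmount=noauto keeping it out of automatic mounts.

```python
import shlex

VALID_CANMOUNT = ("on", "off", "noauto")


def receive_unmounted_cmd(target):
    """argv for receiving a stream from stdin without mounting the result."""
    return ["zfs", "receive", "-u", target]


def set_canmount_cmd(dataset, value="noauto"):
    """argv for changing canmount so the dataset is not mounted automatically."""
    if value not in VALID_CANMOUNT:
        raise ValueError("canmount must be one of: " + ", ".join(VALID_CANMOUNT))
    return ["zfs", "set", "canmount=" + value, dataset]


def as_shell(argv):
    """Render an argv list as a shell-quoted string for review before running."""
    return " ".join(shlex.quote(arg) for arg in argv)


# Hypothetical dataset name, for illustration only.
for cmd in (receive_unmounted_cmd("backup/nextcloud-copy"),
            set_canmount_cmd("backup/nextcloud-copy")):
    print(as_shell(cmd))
```

Note that -u only affects the receive itself, so canmount would have to be
set before the next boot or "zfs mount -a" for the copy to stay unmounted.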
My backups(?) consisted of dataset replication onto other pools (in the same
machine and on another, soon to be offsite, machine running TrueNAS). When I
entered the key and mounting occurred, both other systems would panic.
The only solution I could think of was to create a new dataset and copy over
(using rsync in this case) all the folders except /usr/src. I copied /usr/src
from another jail.
I have renamed and kept the original dataset for potential debugging in the
future.
Moral of the story: Proof that ZFS replication is actually NOT the same as a
backup. The corruption was propagated in a more virulent form (mount == panic)
to the replicated dataset.
At some point I would appreciate being able to help someone figure out what
has happened to the dataset, and how to prevent something similar in the
future. It has shaken my faith in ZFS a little.