GELI + Zpool Scrub Results in GELI Device Destruction (and Later a Corrupt Pool)

Michael B. Eichorn ike at michaeleichorn.com
Mon Apr 25 12:27:23 UTC 2016


On Mon, 2016-04-25 at 10:11 +0200, Fabian Keil wrote:
> "Michael B. Eichorn" <ike at michaeleichorn.com> wrote:
> 
> > 
> > I just ran into something rather unexpected. I have a pool consisting
> > of a mirrored pair of geli-encrypted partitions on WD Red 3TB disks.
> > 
> > The machine is running 10.3-RELEASE; the root zpool was set up with
> > GELI encryption from the installer, and the pool that is acting up
> > was set up per the handbook.
> [...]
> > 
> > I had just noticed that I had failed to enable the zpool scrub
> > periodic on this machine, so I began to run zpool scrub by hand. It
> > succeeded for the root pool, which is also geli encrypted, but when
> > I ran it against my primary data pool I encountered:
> > 
> > Apr 24 23:18:23 terra kernel: GEOM_ELI: Device ada3p1.eli destroyed.
> > Apr 24 23:18:23 terra kernel: GEOM_ELI: Detached ada3p1.eli on last close.
> > Apr 24 23:18:23 terra kernel: GEOM_ELI: Device ada2p1.eli destroyed.
> > Apr 24 23:18:23 terra kernel: GEOM_ELI: Detached ada2p1.eli on last close.
> Did you attach the devices using geli's -d (auto-detach) flag?
> 
> 


I am using whatever defaults come out of the rc.d scripts. My rc.conf
was:

geli_devices="ada2p1 ada3p1"
geli_default_flags="-k /root/encryption.key"
zfs_enable="YES"

I will try adding geli_autodetach="NO" and scrubbing in about 9 hours.
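
If I am reading /etc/rc.d/geli and /etc/defaults/rc.conf correctly
(geli_autodetach appears to default to "YES" there), the rc.conf above
would then become:

geli_devices="ada2p1 ada3p1"
geli_default_flags="-k /root/encryption.key"
geli_autodetach="NO"
zfs_enable="YES"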


> If yes, this is a known issue:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=117158
> 

Reading that bug in detail, it appears to be *specifically* about the
kernel panic; zfs closing and reopening providers is expected behavior,
and if geli has autodetach configured it will detach when that happens.

It strikes me that even though this is expected behavior, it should not
be. Is there a way we could prevent the detach when zfs closes and
reopens providers? I cannot think of a case where the desired behavior
is for the geli providers to detach when zfs merely wants to reopen
them.
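
In the meantime, one way to confirm whether detach-on-last-close is
actually in effect on the providers should be geli's list command; if I
am reading geli(8) correctly, the Flags line will include the detach
flags (W-DETACH, R-DETACH) when the device was attached with -d or
marked with detach -l:

# geli list ada2p1.eli

Given the behavior above, I would expect to see those flags set on my
providers.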

> > 
> > And the scrub failed to initialize (command never returned to the
> > shell).
> This could be the result of another known bug:
> https://lists.freebsd.org/pipermail/freebsd-current/2015-October/057988.html
> 
> > 
> > I immediately rebooted and both disks came back and resilvered, with
> > permanent metadata errors.
> Did those errors appear while resilvering or could they have been
> already present before?

I do not think they were present before the disks flip-flopped. There
was no error before my attempt to resilver.

I would expect metadata errors, as I effectively had:
disk 1 online, disk 2 offline
then immediately, without a resilver:
disk 1 offline, disk 2 online
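
For completeness, the permanent errors I am referring to are the ones
enumerated by zpool status with -v (the pool name here is just a
placeholder for my data pool):

# zpool status -v tank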

> Fabian