zfs drive keeps failing between export and import

Michael Proto mike at jellydonut.org
Thu Jan 22 17:45:05 PST 2009


On Thu, Jan 22, 2009 at 4:24 PM, David Ehrmann <ehrmann at gmail.com> wrote:
> On Fri, Jan 16, 2009 at 3:21 PM, David Ehrmann <ehrmann at gmail.com> wrote:
>> On Fri, Jan 16, 2009 at 3:33 AM, Pete French
>> <petefrench at ticketswitch.com> wrote:
>>>> a software problem before hardware.  Both drives are encrypted geli
>>>> devices.  I tried to reproduce the error with 1GB disk images (vs
>>>
>>> This is probably a silly question, but are you sure that the drives
>>> are not auto detaching ? I had big problems with a zfs mirror on top
>>> of geli which turned out to be that drives mounted using "geli_devices"
>>> in rc.conf will auto detach unless you set "geli_autodetach" to NO.
>>
>> Not silly at all.  I didn't know that could be an issue, but they
>> weren't mounted with "geli_devices," they were mounted by hand with
>> "geli attach /dev/ad<disk>."  I did not set the -d flag on attach, and
>> I don't think I used the -l flag on detach, either.  Listing the
>> device says this:
>>
>> Geom name: ad10.eli
>> EncryptionAlgorithm: AES-CBC
>> KeyLength: 128
>> Crypto: hardware
>> UsedKey: 0
>> Flags: NONE
>>
>> (and more stuff)
>>
>> One more interesting thing: I accidentally rebooted the system without
>> any detaching/exporting (it involved a different, bad drive).  When it
>> came up, I was able to re-import tank without any problems.
>>
>
> Ok, here's where it gets interesting:
>
> The next time I saw the import error, I ran zdb -l on the actual dev.
> It couldn't find the labels.  So I used dd to grab the first 4k of the
> .eli device and the actual device.  Once I got the import working
> again, I repeated the dd.
> The data in the first 4k of /dev/ad8 were all 0x00 both times.  I'm
> guessing this is reserved, or something.  The data in the first 4k of
> /dev/ad8.eli differed between runs (so zdb -l is probably right about
> not finding the label).
>
> In the /dev/ad8.eli that zfs doesn't recognize, I found a 16-byte
> string that was repeated a lot, but it was also repeated in another
> place: the good /dev/ad10.eli (though the offsets were different).
> The other weird thing: the good and bad /dev/ad8.eli look a lot alike:
> one 16-byte string, then another that gets repeated, then another
> 16-byte string randomly shows up at 0x200.
>
> Why the same data appear in the bad ad8.eli as the good ad10.eli, I'm
> not sure (I do have the same password and no keyfile with geli), but
> the patterns of data looking the same make me think something's wrong
> with the encryption.  It's using 128-bit AES-CBC, and these patterns
> would not be hidden by it (128 bits == 16 bytes).
>
> I'm using a Via C7 CPU's padlock cryptographic accelerator, and geli
> reports this.  I'm guessing this is either a padlock or a geli bug.
>
> I can't reliably reproduce this problem, but doing it with padlock off
> might be a good test.
>

I saw something similar (minus zfs) when I was playing with padlock
and geli on my C7-Esther fileserver. When trying to mount a geli
partition I'd intermittently get a bad decryption key error. Running
the same command again to mount the partition would work fine. This
happened with both password and key-file operations. IIRC when I
disabled padlock acceleration it worked fine in my limited testing.
That was on 6.4; now that I'm on 7.1, it might be worth looking at
again.
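
For anyone wanting to repeat the comparison described in the quoted
message, here is a minimal Python sketch that scans a 4k dump for
repeated 16-byte blocks and lists blocks shared between two dumps.
It assumes the dumps were saved to ordinary files with dd; the file
names in the comments are hypothetical, and the function names are
only illustrative. With CBC working as intended, aligned 16-byte
ciphertext blocks would be expected to repeat essentially never, so
any repeats reported here are a hint worth following up, not proof
of a padlock or geli bug.

```python
import sys

BLOCK = 16  # AES block size in bytes (128 bits)


def repeated_blocks(data: bytes, block: int = BLOCK) -> dict:
    """Map each aligned block that occurs more than once to its offsets."""
    offsets = {}
    for off in range(0, len(data) - block + 1, block):
        offsets.setdefault(data[off:off + block], []).append(off)
    return {blk: offs for blk, offs in offsets.items() if len(offs) > 1}


def shared_blocks(a: bytes, b: bytes, block: int = BLOCK) -> set:
    """Return the aligned blocks that appear in both dumps, at any offset."""
    def blocks(d: bytes) -> set:
        return {d[i:i + block] for i in range(0, len(d) - block + 1, block)}
    return blocks(a) & blocks(b)


if __name__ == "__main__":
    # Hypothetical usage: python blocks.py ad8.eli.bin ad10.eli.bin
    # where each file holds the first 4k of a device as saved by dd.
    dumps = []
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            data = fh.read(4096)
        dumps.append(data)
        for blk, offs in repeated_blocks(data).items():
            where = ", ".join(hex(o) for o in offs)
            print(f"{path}: {blk.hex()} repeats at {where}")
    if len(dumps) == 2:
        for blk in shared_blocks(dumps[0], dumps[1]):
            print(f"shared between both dumps: {blk.hex()}")
```

Running it on a dump taken with padlock enabled and another taken
with it disabled would show whether the repeated blocks only appear
in the accelerated case.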


-Proto

