zfs drive keeps failing between export and import
ehrmann at gmail.com
Thu Jan 22 21:12:07 PST 2009
On Thu, Jan 22, 2009 at 9:08 PM, David Ehrmann <ehrmann at gmail.com> wrote:
> On Thu, Jan 22, 2009 at 5:45 PM, Michael Proto <mike at jellydonut.org> wrote:
>> On Thu, Jan 22, 2009 at 4:24 PM, David Ehrmann <ehrmann at gmail.com> wrote:
>>> On Fri, Jan 16, 2009 at 3:21 PM, David Ehrmann <ehrmann at gmail.com> wrote:
>>>> On Fri, Jan 16, 2009 at 3:33 AM, Pete French
>>>> <petefrench at ticketswitch.com> wrote:
>>>>>> a software problem before hardware. Both drives are encrypted geli
>>>>>> devices. I tried to reproduce the error with 1GB disk images (vs
>>>>> This is probably a silly question, but are you sure that the drives
>>>>> are not auto detaching? I had big problems with a zfs mirror on top
>>>>> of geli which turned out to be that drives mounted using "geli_devices"
>>>>> in rc.conf will auto detach unless you set "geli_autodetach" to NO.
>>>> Not silly at all. I didn't know that could be an issue, but they
>>>> weren't mounted with "geli_devices," they were mounted by hand with
>>>> "geli attach /dev/ad<disk>." I did not set the -d flag on attach, and
>>>> I don't think I used the -l flag on detach, either. Listing the
>>>> device says this:
>>>> Geom name: ad10.eli
>>>> EncryptionAlgorithm: AES-CBC
>>>> KeyLength: 128
>>>> Crypto: hardware
>>>> UsedKey: 0
>>>> Flags: NONE
>>>> (and more stuff)
>>>> One more interesting thing: I accidentally rebooted the system without
>>>> any detaching/exporting (it involved a different, bad drive). When it
>>>> came up, I was able to re-import tank without any problems.
>>> Ok, here's where it gets interesting:
>>> The next time I saw the import error, I ran zdb -l on the actual dev.
>>> It couldn't find the labels. So I used dd to grab the first 4k of the
>>> .eli device and the actual device. Once I got the import working
>>> again, I repeated the dump.
>>> The data in the first 4k of /dev/ad8 were all 0x00 both times. I'm
>>> guessing this is reserved, or something. The data in the first 4k of
>>> /dev/ad8.eli differed between runs (so zdb -l is probably right about
>>> not finding the label).
>>> In the /dev/ad8.eli that zfs doesn't recognize, I found a 16 byte
>>> string that was repeated a lot, but it was also repeated in another
>>> place: the good /dev/ad10.eli (though the offsets were different).
>>> The other weird thing: the good and bad /dev/ad8.eli look a lot alike:
>>> one 16 byte string, then another that gets repeated, then another 16
>>> byte string randomly shows up at 0x200.
>>> Why the same data appear in the bad ad8.eli as the good ad10.eli, I'm
>>> not sure (I do have the same password and no keyfile with geli), but
>>> the patterns of data looking the same make me think something's wrong
>>> with the encryption. It's using 128 bit AES-CBC, and these patterns
>>> would not be hidden by it (128 bits == 16 bytes).
>>> I'm using a Via C7 CPU's padlock cryptographic accelerator, and geli
>>> reports this. I'm guessing this is either a padlock or a geli bug.
>>> I can't reliably reproduce this problem, but doing it with padlock off
>>> might be a good test.
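For anyone who wants to repeat the block comparison described above, here
is a rough Python sketch of it. It assumes the first 4k of each device has
already been saved to a file with dd; the file names in the usage line and
the function names are made up for illustration. It only looks at aligned
16 byte blocks (the AES block size), so it says nothing about where a
corruption actually comes from.

```python
import sys

BLOCK = 16  # AES block size in bytes


def aligned_blocks(data, block=BLOCK):
    """Yield (offset, block_bytes) for each complete aligned block."""
    for off in range(0, len(data) - block + 1, block):
        yield off, data[off:off + block]


def repeated_blocks(data, block=BLOCK):
    """Map each block that occurs more than once to its list of offsets."""
    offsets = {}
    for off, chunk in aligned_blocks(data, block):
        offsets.setdefault(chunk, []).append(off)
    return {chunk: offs for chunk, offs in offsets.items() if len(offs) > 1}


def shared_blocks(a, b, block=BLOCK):
    """Return the set of blocks that appear in both dumps, at any offset."""
    in_a = {chunk for _, chunk in aligned_blocks(a, block)}
    in_b = {chunk for _, chunk in aligned_blocks(b, block)}
    return in_a & in_b


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python blocks.py ad8.eli.bad.4k ad10.eli.good.4k
    with open(sys.argv[1], "rb") as f:
        first = f.read()
    with open(sys.argv[2], "rb") as f:
        second = f.read()
    for chunk, offs in repeated_blocks(first).items():
        print(chunk.hex(), "repeats at", [hex(o) for o in offs])
    for chunk in shared_blocks(first, second):
        print(chunk.hex(), "appears in both dumps")
```

A block of all zeros will show up as repeated on any mostly empty region,
so hits are only interesting when the repeated bytes look random.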
>>> freebsd-stable at freebsd.org mailing list
>>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
>> I saw something similar (minus zfs) when I was playing with padlock
>> and geli on my C7-Esther fileserver. When trying to mount a geli
>> partition I'd intermittently get a bad decryption key error. Run the
>> same command again to mount the partition and it'd work fine. This was
>> using both password and key-file operations. IIRC when I disabled
>> padlock acceleration it worked fine in my limited testing. That was
>> 6.4, now that I'm on 7.1 it might be worth looking at again.
> I just got around to trying it without padlock. I tried to replicate
> the problem 5 or 6 times, but no luck.
> This is 7.1.
> It *sounds* like a padlock problem, but I'd like to see it make the
> same mistake with a file or memory backed md device. Anyway, at
> this point, I can pretty much rule out zfs as the culprit.
Or geli... Any success (not intermittent) reports with a hifn or
broadcom accelerator and geli?