geom_eli related panic on sparc64

Alex alexb at netservicesplc.com
Sat Apr 29 18:04:43 UTC 2006


Hi,

I've been playing around with geom_eli for disk encryption as it can use
hardware acceleration cards such as the Soekris 1401 to speed things up.
However, whenever I run "geli attach" to attach a geli device, I get a
kernel panic. The device in question is one slice of a geom_mirror RAID
array -- I have not tested an individual device to determine whether this
makes any difference.

System details:
Sun Netra t1, UltraSPARC IIi 440MHz, 512MB RAM.
FreeBSD 6.0-RELEASE
2x18.2GB SCSI disks using geom_mirror for RAID1
(I can post a full dmesg if required)

I have done a little investigation into this problem after being pointed
at some kernel debugging instructions in the FreeBSD Developers' Handbook.
The problem occurs with both the stock GENERIC kernel and a self-compiled
kernel.

I recompiled a debugging kernel and obtained the following stack trace
from DDB after the panic occurred. The g_post_event_x() line is due to
enabling kern.geom.eli.debug="255" in /boot/loader.conf.

# geli attach -k ~/home.key /dev/mirror/gm0f
Enter passphrase:
g_post_event_x(0xc010a3c0, 0xfffff80000518e00, 2, 262144)
panic: trap: memory address not aligned
KDB: enter: panic
[thread pid 2 tid 100004 ]
Stopped at      kdb_enter+0x3c: ta              %xcc, 1
db> trace
Tracing pid 2 tid 100004 td 0xfffff8002fb8f400
panic() at panic+0xf0
trap() at trap+0x3f0
-- memory address not aligned sfar=0xd18ed31d sfsr=0x4002d %o7=0xc016ac14 --
Encode() at Encode+0x64
g_eli_read_metadata() at g_eli_read_metadata+0x38c
g_eli_ctl_attach() at g_eli_ctl_attach+0x144
g_eli_config() at g_eli_config+0x94
g_ctl_req() at g_ctl_req+0x8c
one_event() at one_event+0x150
g_run_events() at g_run_events+0x4
g_event_procbody() at g_event_procbody+0x70
fork_exit() at fork_exit+0x94
fork_trampoline() at fork_trampoline+0x8

I looked through the kernel code and found Encode() to be in kern/md5c.c.

There are a few gaps in the stack trace due to inlined code. From what
I've been able to piece together, the full call chain, innermost first, is:

Encode() in kern/md5c.c
MD5Final() in kern/md5c.c
eli_metadata_decode_v0() in geom/eli/g_eli.h
eli_metadata_decode() in geom/eli/g_eli.h
g_eli_read_metadata() in geom/eli/g_eli.c
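
As an illustration of what a "memory address not aligned" trap in a
routine like Encode() can mean: sparc64 requires a 32-bit store to target
a 4-byte-aligned address, and the sfar value in the trace (0xd18ed31d) is
odd. The sketch below is not the code from kern/md5c.c and is not a
confirmed diagnosis; encode_le32_bytewise() and encode_demo() are
hypothetical names. It assumes the goal is to write 32-bit words out in
little-endian byte order, and it does so one byte at a time so that the
destination buffer may sit at any alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical illustration, NOT the code from kern/md5c.c.
 * Store each 32-bit word of `input` into `output` in little-endian
 * byte order. Only single-byte stores are used, so `output` may have
 * any alignment. `len` is the number of output bytes (multiple of 4).
 */
static void
encode_le32_bytewise(unsigned char *output, const uint32_t *input, size_t len)
{
	size_t i, j;

	assert(len % 4 == 0);
	for (i = 0, j = 0; j < len; i++, j += 4) {
		output[j]     = (unsigned char)(input[i] & 0xff);
		output[j + 1] = (unsigned char)((input[i] >> 8) & 0xff);
		output[j + 2] = (unsigned char)((input[i] >> 16) & 0xff);
		output[j + 3] = (unsigned char)((input[i] >> 24) & 0xff);
	}
}

/*
 * Exercise the encoder on a deliberately odd destination address
 * (buf + 1). Returns 1 if the bytes come out in little-endian order.
 */
static int
encode_demo(void)
{
	unsigned char buf[9];
	const uint32_t words[2] = { 0x01020304u, 0xa0b0c0d0u };
	static const unsigned char expect[8] = {
		0x04, 0x03, 0x02, 0x01, 0xd0, 0xc0, 0xb0, 0xa0
	};

	memset(buf, 0, sizeof(buf));
	encode_le32_bytewise(buf + 1, words, 8);
	return (memcmp(buf + 1, expect, sizeof(expect)) == 0);
}
```

By contrast, casting an arbitrary unsigned char pointer to a 32-bit
pointer and storing through it is only safe when the pointer happens to
be 4-byte aligned; on strict-alignment machines a store through a
misaligned pointer traps instead of completing.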

I have not tested either 6-STABLE or 7-CURRENT to see if this problem
still exists, so if it has been noticed and fixed already, I apologise in
advance. I'm afraid I have no useful experience to fix this problem
myself, but I am quite happy to provide any information that can aid in
tracking down the issue.

I am going to disable the geom_mirror array temporarily so that I can
confirm that the problem still occurs when using the SCSI devices
directly, and so that I can capture a full kernel dump to help with the
debugging process.

Incidentally, while investigating this, I noticed that the
g_mirror_taste() function is causing READ_BIG errors on the ATA CD-ROM
drive (acd0) in the machine.

Thanks in advance for any help!

Alex

