Issues with GEOM and raid5?
elshar at cheekan.org
Sun Oct 16 19:25:27 PDT 2005
My raid5 gvinum array seems to be causing my server to freeze and/or
kernel panic.
I've got a dual-core Opteron running 6rc1 (rebuilt world/kernel as of
yesterday), with an LSI MegaRAID 300-8X with 8 drives attached. It's
currently set up with 4 hardware raid1 arrays. The drives FBSD sees are
actually something like amrd0-4. Just to clarify: 8 drives in 4 hardware
raid1 arrays, which are then used as the 4 logical drives of a raid5
gvinum array.
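To make that layout concrete, here is a rough sketch of the capacity arithmetic for raid1 pairs combined into one raid5 array. The function name is made up for illustration, and it assumes all drives are the same size (the actual drive sizes aren't given here), so capacity is counted in units of one physical drive.

```python
def usable_drive_units(physical_drives: int, mirror_width: int = 2) -> int:
    """Usable capacity, in units of one physical drive, when the drives are
    grouped into RAID1 mirrors that then become members of one RAID5 array.

    Assumption (not from the original post): all drives are the same size.
    """
    if physical_drives % mirror_width != 0:
        raise ValueError("drives must divide evenly into mirror groups")
    # Each mirror group presents itself as a single logical drive.
    logical_drives = physical_drives // mirror_width
    if logical_drives < 3:
        raise ValueError("RAID5 needs at least three members")
    # RAID5 spends one member's worth of space on parity.
    return logical_drives - 1


if __name__ == "__main__":
    units = usable_drive_units(8)
    print(f"{units} of 8 drives' worth of space is usable ({units / 8:.1%})")
```

With the 8 drives described above, that works out to 3 drives' worth of usable space: half is lost to mirroring, and one of the four logical drives to parity.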
I got gvinum working, the array shows up on boot, and everything's fine.
But it seems that after I do a *lot* of writes to the array, interesting
things start happening. There may even be symptoms during the transfers
themselves: I've noticed they occasionally stall for anywhere up to
about 10 seconds.
The first time it did this I got a message about increasing
PMAP_SHPGPERPROC. It also caused the raid card itself to think there was
something wrong with two of the eight drives. I had to hotplug them
while in the raid card's BIOS to get it to accept that the drives were
not dead and allow a rebuild of the two affected raid1 arrays.
Tonight before the crash I got through writing approximately 45,000
files to it, about 310GB in total, using dd. All of those went fine, but
then any process that tried to read anything from the array became
non-responsive, and then the machine froze. Unfortunately, I won't be
able to physically get to it until tomorrow afternoon, but I was hoping
someone might suggest some things to try to coax more information out of
it about what's going on.
I haven't tried increasing the PMAP value. It seemed to me that it would
only hide the actual issue. As a side note, the first time the array did
this I hadn't yet recompiled 6rc1 for SMP, so it was still running in UP
mode. It was actually doing the buildworld for SMP when it died.
It is also a fresh install of 6rc1 plus whatever commits were made as of
approximately Friday.
Any suggestions? Things I should look for? I'll reply back tomorrow with
(hopefully) what it was complaining about and whatever debug info comes
out of suggestions to this email.