RAID issues

Brian Kraemer brian at etchings.com
Sun Feb 26 18:08:13 PST 2006


I'm not subscribed to freebsd-questions, so please Cc me on any responses.

Hello,

I'm in the process of building a new server based on Supermicro's
SuperServer 5014C-T platform. It uses the Intel ICH6 SATA controllers and
supports RAID0 and RAID1 via Intel MatrixRAID.

I have the BIOS set up for RAID1, and FreeBSD usually detects the array
without any trouble. My problem is that occasionally, after a reboot, one
of the drives (usually the second one, ata3 on atapci1) is not detected at
all by FreeBSD. The BIOS continues to detect both drives, but FreeBSD does
not.

When this happens, FreeBSD reports that the RAID is in a degraded state. I
can use atacontrol to detach and reattach ata3, which usually finds the
drive, but by then the damage has been done. By damage I mean this: on the
next reboot, I get massive filesystem errors, even after a full fsck.
These errors are usually related to soft-updates.

These errors are so bad that the kernel panics as soon as a file is
accessed on the bad partition. The only workaround I have found so far is
to boot into single-user mode and run newfs on the partition(s) that are
causing the kernel panic. Obviously, this is a less-than-ideal solution.
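
For reference, the detach/reattach step mentioned above could be scripted
roughly as follows. This is only a sketch: the function name
reattach_commands is made up for illustration, the defaults (ata3, ar0)
are taken from this post, and the subcommand names (detach, attach,
status) are what I understand atacontrol(8) to accept, so check the man
page on the installed release before relying on them. The script only
prints the commands; it does not run anything.

```python
def reattach_commands(channel="ata3", array="ar0"):
    """Return the atacontrol invocations for one detach/reattach cycle.

    Hypothetical helper: nothing is executed here, the caller decides
    whether (and how) to run the returned commands as root.
    """
    return [
        ["atacontrol", "detach", channel],  # drop the channel that lost its drive
        ["atacontrol", "attach", channel],  # re-probe the channel for devices
        ["atacontrol", "status", array],    # show the array state afterwards
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in reattach_commands():
        print(" ".join(argv))
```

Printing rather than executing is deliberate, since detaching a channel
on a live array is not something to do by accident.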

My question is this: does this sound like bad hardware or a software
problem? I thought at first that it might be a bad hard drive, but I ran
some diagnostic software on both drives and they came up clean. Has anyone
else experienced this?

Perhaps turning off soft-updates is the answer?

Here's some dmesg output (when things are working properly):

atapci0: <Intel ICH6 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
atapci1: <Intel ICH6 SATA150 controller> port 0xe900-0xe907,0xea00-0xea03,0xeb00-0xeb07,0xec00-0xec03,0xed00-0xed0f mem 0xd03c3000-0xd03c33ff irq 19 at device 31.2 on pci0
ata2: <ATA channel 0> on atapci1
ata3: <ATA channel 1> on atapci1

acd0: CDROM <CD-224E-N/1.AA> at ata0-master UDMA33
ad4: 381554MB <WDC WD4000YR-01PLB0 01.06A01> at ata2-master SATA150
ad6: 381554MB <WDC WD4000YR-01PLB0 01.06A01> at ata3-master SATA150
ar0: 381553MB <Intel MatrixRAID RAID1> status: READY
ar0: disk0 READY (master) using ad4 at ata2-master
ar0: disk1 READY (mirror) using ad6 at ata3-master
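
To catch the missing drive right after boot, the ar0 lines above could be
checked with a small script. The following is a rough sketch, not a tested
tool: the regular expressions are modelled only on the healthy output
shown here, the names parse_raid and is_healthy are made up, and I do not
know exactly what the degraded output looks like, so the check simply
treats anything other than "READY with two READY members" as a problem.

```python
import re

# Sample lines copied from the dmesg output in this post (healthy boot).
SAMPLE_DMESG = """\
ad4: 381554MB <WDC WD4000YR-01PLB0 01.06A01> at ata2-master SATA150
ad6: 381554MB <WDC WD4000YR-01PLB0 01.06A01> at ata3-master SATA150
ar0: 381553MB <Intel MatrixRAID RAID1> status: READY
ar0: disk0 READY (master) using ad4 at ata2-master
ar0: disk1 READY (mirror) using ad6 at ata3-master
"""

# Patterns are assumptions based on the healthy output only.
ARRAY_RE = re.compile(r"^(ar\d+): \d+MB <(.+)> status: (\S+)")
MEMBER_RE = re.compile(
    r"^(ar\d+): (disk\d+) (\S+) \((\w+)\) using (ad\d+) at (ata\d+)-(\w+)"
)


def parse_raid(dmesg_text):
    """Collect array status and member disks from dmesg text."""
    arrays = {}
    for line in dmesg_text.splitlines():
        m = ARRAY_RE.match(line)
        if m:
            entry = arrays.setdefault(m.group(1), {"status": None, "members": []})
            entry["status"] = m.group(3)
            continue
        m = MEMBER_RE.match(line)
        if m:
            entry = arrays.setdefault(m.group(1), {"status": None, "members": []})
            entry["members"].append(
                {"slot": m.group(2), "state": m.group(3), "role": m.group(4),
                 "disk": m.group(5), "channel": m.group(6)}
            )
    return arrays


def is_healthy(array, expected_members=2):
    """True only if the array and all expected members report READY."""
    return (
        array["status"] == "READY"
        and len(array["members"]) == expected_members
        and all(member["state"] == "READY" for member in array["members"])
    )


if __name__ == "__main__":
    for name, array in parse_raid(SAMPLE_DMESG).items():
        disks = ", ".join(m["disk"] for m in array["members"])
        print(name, "healthy" if is_healthy(array) else "CHECK ME", disks)
```

In practice the text would come from the real dmesg output rather than the
embedded sample.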


I still have the latest vmcore dump if that will help.

-Brian

