Software RAID1 with atacontrol
maillist+fc at bolet.org
Mon Sep 27 02:46:18 PDT 2004
I am trying to use the RAID features provided by the ata driver in 5.3.
My machine runs 5.3-BETA5, cvsupped on September 24th, 2004. The kernel
is almost identical to GENERIC: I just disabled I486_CPU, I586_CPU (the
machine is an Athlon XP) and SMP, and replaced SCHED_4BSD with SCHED_ULE.
As far as I know:
-- some (cheap) RAID extension cards are just ATA controllers with a
special boot ROM which understands the metadata stored on the disks and
describing how the disks are grouped in RAID arrays;
-- FreeBSD understands the metadata format for some models;
-- FreeBSD can also use an arbitrary ATA controller with the same
code, thus providing software RAID (using the same metadata format
as Promise RAID cards). Since an ordinary ATA controller does not
have the special boot ROM, booting from such an array is subject to
some restrictions, but once the kernel is up, it should make no
difference.
I have tried to set up a simple RAID1 (mirror) array. I have plugged two
identical 40GB IBM disks in my machine (respectively ad1 and ad3), on
the motherboard ATA controllers (with no RAID boot ROM). Then, I created
the array with:
atacontrol create RAID1 ad1 ad3
This produced ar0, on which I could create a slice table, then a
disklabel in the first slice, then a filesystem in /dev/ar0s1d, which
I could mount. So far so good.
I then tried to simulate a crash by unplugging one disk (with the
machine switched off). On reboot, the mirror could still be accessed
under the name ar0s1d, and atacontrol reported one disk as READY and
the other as DOWN. This is fine. I thus made some write accesses on the
filesystem, switched off the machine, plugged back the missing disk and
rebooted. There, things went wrong: the code did not detect that the
two disks were out of sync, and happily mounted the filesystem.
Bad errors then happened upon reading the disk ("ls" reporting "invalid
file descriptor", and so on). I detached one disk (with "atacontrol
detach") and attached it again, and then added it with "addspare"; then
I rebuilt the array ("atacontrol rebuild") and all was fine again.
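
The recovery steps just described could be scripted roughly as follows.
This is only a sketch: the array name ar0, the stale disk ad3, the
channel number 1 and the exact argument forms are assumptions, not taken
from the message, and should be checked against the local atacontrol(8)
manual page. By default the script only prints the commands; it runs
them when RUN=1 is set.

```shell
#!/bin/sh
# Sketch of the manual resync sequence (detach, attach, addspare, rebuild).
# ASSUMPTIONS: array ar0, stale disk ad3 on ATA channel 1, and argument
# forms as accepted by the local atacontrol(8); adjust before real use.
ARRAY=${ARRAY:-ar0}
DISK=${DISK:-ad3}
CHANNEL=${CHANNEL:-1}

# Print the command unless RUN=1, in which case execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

resync_mirror() {
    run atacontrol detach "$CHANNEL"          # drop the channel holding the stale disk
    run atacontrol attach "$CHANNEL"          # probe the channel again
    run atacontrol addspare "$ARRAY" "$DISK"  # offer the disk to the array as a spare
    run atacontrol rebuild "$ARRAY"           # copy the good half onto the spare
    run atacontrol status "$ARRAY"            # show the resulting array state
}
```

Calling resync_mirror without RUN=1 is a dry run, which makes it easy to
review the sequence before touching the array.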
I then did a second test: I launched a process which wrote to the
mirror filesystem ("dd if=/dev/random of=foo") and, while the system
was writing to the disk, I hit the reset switch (thus simulating a
power failure). Since the two disk writes cannot be guaranteed to be
simultaneous, the two disks cannot be assumed to be in sync after
such a reset. But, upon reboot, the system considered both disks to
be READY.
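
Whether the two halves really diverged can be checked independently of
the ata driver by comparing the start of both members. The following is
a minimal sketch, not part of the original tests: the function name
compare_halves is hypothetical, it assumes the filesystem is not mounted
while it runs, and it assumes the RAID metadata lives outside the
compared region (otherwise per-disk metadata would always differ).

```shell
#!/bin/sh
# compare_halves A B [BLOCKS]: checksum the first BLOCKS 64 KiB blocks of
# two mirror members (raw devices, or plain files standing in for them)
# and report whether they match. Default is 16 blocks (1 MiB).
# ASSUMPTION: the compared region holds only mirrored data, no metadata.
compare_halves() {
    a=$1; b=$2; blocks=${3:-16}
    sum_a=$(dd if="$a" bs=65536 count="$blocks" 2>/dev/null | cksum)
    sum_b=$(dd if="$b" bs=65536 count="$blocks" 2>/dev/null | cksum)
    if [ "$sum_a" = "$sum_b" ]; then
        echo "in sync"
    else
        echo "DIVERGED"
        return 1
    fi
}
```

For the setup described here the call would be something like
"compare_halves /dev/ad1 /dev/ad3", with a larger block count to cover
more of the disk.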
Therefore I think that there is something fishy here. Since the same
code is used for Promise RAID cards, the support for those cards may be
"broken" as well (at least unreliable in case of a crash, which is a
problem since RAID1 is meant to add reliability in case of a crash).
What can I do to help debug this?
As a side note: I get the same read and write throughput on the mirror
(about 20 MB/s for reading or writing -- this is what each disk can do
alone). I expected a doubled throughput for reading (the disks are on
two distinct ATA controllers). Maybe this is a symptom of something
wrong elsewhere?
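
A throughput comparison of this kind can be reproduced with a plain
sequential read through dd. The sketch below is an illustration only:
the function name read_rate and the 512 MB default are assumptions, the
source must hold at least that much data, and the timing uses whole
seconds, so short runs give coarse numbers.

```shell
#!/bin/sh
# read_rate SRC [MB]: read MB mebibytes sequentially from SRC, discard
# them, and print an approximate rate in MB/s (integer, whole seconds).
# ASSUMPTION: SRC holds at least MB mebibytes and is not already cached.
read_rate() {
    src=$1; mb=${2:-512}
    start=$(date +%s)
    dd if="$src" of=/dev/null bs=1048576 count="$mb" 2>/dev/null
    end=$(date +%s)
    elapsed=$((end - start))
    [ "$elapsed" -gt 0 ] || elapsed=1    # avoid dividing by zero on very short runs
    echo "$src: $((mb / elapsed)) MB/s"
}
```

Running it in turn on /dev/ad1, /dev/ad3 and /dev/ar0 gives the
single-disk and mirror figures side by side.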