kern/79035: gvinum unable to create a striped set of mirrored
sets/plexes
Sven Willenberger
sven at dmv.com
Sat Mar 19 20:50:11 PST 2005
The following reply was made to PR kern/79035; it has been noted by GNATS.
From: Sven Willenberger <sven at dmv.com>
To: "Greg 'groggy' Lehey" <grog at FreeBSD.org>
Cc: freebsd-gnats-submit at FreeBSD.org
Subject: Re: kern/79035: gvinum unable to create a striped set of mirrored
sets/plexes
Date: Sat, 19 Mar 2005 23:43:00 -0500
Greg 'groggy' Lehey presumably uttered the following on 03/19/05 22:11:
> [Format recovered--see http://www.lemis.com/email/email-format.html]
>
> Single line paragraphs. Please limit your lines to < 80 characters.
Sorry, used the web interface ... didn't realize it would not
automatically wrap.
>
> On Sunday, 20 March 2005 at 2:04:34 +0000, Sven Willenberger wrote:
>
>>Under the current implementation of gvinum it is possible to create
>>a mirrored set of striped plexes but not a striped set of mirrored
>>plexes. For purposes of resiliency the latter configuration is
>>preferred as illustrated by the following example:
>>
>>Use 6 disks to create one of 2 different scenarios.
>>
>>1) Using the current abilities of gvinum create 2 striped sets using
>>3 disks each: A1 A2 A3 and B1 B2 B3 then create a mirror of those 2
>>sets such that A(123) mirrors B(123). In this situation if any drive
>>in Set A fails, one still has a working set with Set B. If any drive
>>now fails in Set B, the system is shot.
>
>
> No, this is not correct. The plex ("set") only fails when all drives
> in it fail.
>
I hope the following diagrams better illustrate what I was trying to
point out. Data is striped across all the A disks, and that stripe is
mirrored to the B stripe:
    __stripe__
  __|___|____|__
  | A1  A2  A3 | --|m
  |____________|   |i
                   |r
    __stripe__     |r
  __|___|____|__   |o
  | B1  B2  B3 | --|r
  |____________|
If A1 fails, then the A stripe set cannot function (much as in RAID 0,
where one failed disk fails the whole set), meaning that B alone is now
the array:
    __stripe__
  __|___|____|__
  |     A2  A3 | ==> fails
  |____________|    |
                    |
                  --X--
    __stripe__      |
  __|___|____|__    |
  | B1  B2  B3 | ==> remains
  |____________|
If any B disk then fails, the B stripe set fails as well, leaving no
functioning part of the mirror:
    __stripe__
  __|___|____|__
  |     A2  A3 | ==> fails
  |____________|    |
                    |
                  --X--
    __stripe__      |
  __|___|____|__    |
  | B1      B3 | ==> fails
  |____________|
Unless I am misunderstanding and gvinum somehow rebuilds the A stripe
over A2 and A3 if A1 fails.
>
>>2) Using the proposed added ability to create 3 mirror sets A1 and
>>B1, A2 and B2, A3 and B3. Now create a stripe set across all three
>>mirrors. Now we can have a situation where one of the "A" drives
>>fail (for example A1). Then we can also have one of the "B" drives
>>fail and, as long as it is not "B1" in this case, we still have a
>>functioning array.
>
>
> Agreed. So there's no difference.
>
       _____stripe_____
   __|__    __|__    __|__
  | A1 |   | A2 |   | A3 |
  | B1 |   | B2 |   | B3 |
  |____|   |____|   |____|
Now if A1 fails, the B1 part of the mirror can still participate in the
stripe:
       _____stripe_____
   __|__    __|__    __|__
  |    |   | A2 |   | A3 |
  | B1 |   | B2 |   | B3 |
  |____|   |____|   |____|
Likewise if either B2 or B3 fails now we still have a functioning stripe:
       _____stripe_____
   __|__    __|__    __|__
  |    |   | A2 |   | A3 |
  | B1 |   |    |   | B3 |
  |____|   |____|   |____|
At this point we could still have either A3 or B3 fail and still have a
functioning stripe set.
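
The failure sequences in the diagrams can be checked with a small model.
This is only a sketch in Python, not gvinum code: the function names are
made up for illustration, and it assumes a striped plex is unusable as
soon as any one of its disks is gone (the very point under discussion).

```python
# Hypothetical model of the two layouts; not taken from gvinum.
# Assumption: a stripe needs ALL of its disks, a mirror needs ANY one.

def mirror_of_stripes_ok(stripes, failed):
    """Volume works if at least one stripe still has every disk."""
    return any(all(d not in failed for d in stripe) for stripe in stripes)

def stripe_of_mirrors_ok(mirrors, failed):
    """Volume works if every mirror pair still has at least one disk."""
    return all(any(d not in failed for d in pair) for pair in mirrors)

# Scenario 1: two 3-disk stripes, mirrored against each other.
stripes = [("A1", "A2", "A3"), ("B1", "B2", "B3")]
# Scenario 2: three 2-disk mirrors, striped together.
mirrors = [("A1", "B1"), ("A2", "B2"), ("A3", "B3")]

failed = set()
history = []
for disk in ("A1", "B2", "A3"):  # same failure order as the diagrams
    failed.add(disk)
    result = (disk,
              mirror_of_stripes_ok(stripes, failed),
              stripe_of_mirrors_ok(mirrors, failed))
    history.append(result)
    print("%s fails: mirror of stripes ok=%s, stripe of mirrors ok=%s"
          % result)
```

Under that assumption the mirror of stripes is lost at the second
failure (B2), while the stripe of mirrors is still up after the third
(A3).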
>
>>Thus the striping of mirrors (rather than a mirror of striped sets)
>>is a more resilient and fault-tolerant setup of a multi-disk array.
>
>
> No, you're misunderstanding the current implementation.
Perhaps I am ... but unless gvinum somehow reconstructs a 3 disk stripe
into a 2 disk stripe in the event one disk fails, I am not sure how. The
resiliency has to do with a 2 disk failure. Even in a 4 disk scenario,
the mirror of stripes can survive 2 of 6 2-disk failure scenarios while
the stripe of mirrors can survive 4 of 6 2-disk failure scenarios.
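
Those counts can be reproduced by brute force. Again this is only a
sketch in Python, not gvinum code: count_survivable is a made-up name,
and it assumes a striped plex is unusable once any one of its disks has
failed.

```python
from itertools import combinations

def count_survivable(columns, failures=2):
    """Return (total, mirror_of_stripes, stripe_of_mirrors) survivor counts.

    Hypothetical helper: 'columns' is the number of disks per side, so
    the array has 2 * columns disks named A1..An and B1..Bn.
    """
    ids = range(1, columns + 1)
    stripes = [["A%d" % i for i in ids], ["B%d" % i for i in ids]]
    mirrors = [("A%d" % i, "B%d" % i) for i in ids]
    disks = stripes[0] + stripes[1]

    total = mos = som = 0
    for combo in combinations(disks, failures):
        failed = set(combo)
        total += 1
        # Mirror of stripes: at least one stripe must be fully intact.
        mos += any(all(d not in failed for d in s) for s in stripes)
        # Stripe of mirrors: every pair must keep at least one disk.
        som += all(any(d not in failed for d in p) for p in mirrors)
    return total, mos, som

print(count_survivable(2))  # the 4 disk case
print(count_survivable(3))  # the 6 disk case
```

For the 4 disk case this gives (6, 2, 4), matching the figures above;
for the 6 disk case it gives (15, 6, 12).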
>
> This is a change request, so I'm not closing (or even assigning to
> myself) the PR.
>
Fair enough ... I would just like to see the stripe of mirror scenarios
common to hardware raid solutions become a configuration option for
gvinum (or understand why my interpretation above is incorrect), so per
your original advice I submitted this PR.
Sven Willenberger