vdev/pool math with combined raidzX vdevs...

Jason Usher jusher71 at yahoo.com
Mon Jul 9 19:58:20 UTC 2012


Hello again,


--- On Fri, 7/6/12, Zaphod Beeblebrox <zbeeble at gmail.com> wrote:


> ... so, again with simplistic assumptions,
> 
> p(36drz3 --- 12 drives, 3 groups) = p(12drz3) * 3
> 
> A "vanilla" RAID-Z2 (if I make an assumption as to what you're saying) is:
> 
> p(36drz2) = 36 * p(f) * 35 * p(f)
> 
> ... but I can't directly answer your question without knowing a) the structure of the RAID-Z2 array and p(f).  If we use a 1% figure for p(f), then P(36drz3,12,3) = 0.035% and p(36drz2) = 4.3%


(snip)


> Put simply, you add the probabilities of things where any can cause the failure (either drive of R0 failing, any one of the 3 plexes of a complex array failing) and you multiply things where all must fail to produce failure.


Ok.  So let's start with those numbers from that hardforum link I posted:

(probability of data loss during a rebuild, where F is the chance that any one remaining drive fails during the rebuild)

RAID-10:
F = 5%

RAID-Z1 (9 drives):
1 - (1 - F)^(9 - 1) = 33.7%

RAID-Z2 (10 drives):
1 - (1 - F)^(10 - 1) - (10 - 1) F (1 - F)^(10 - 2) = 7.1%

RAID-Z3 (11 drives):
1 - (1 - F)^(11 - 1) - (11 - 1) F (1 - F)^(11 - 2) - (11 - 1)(11 - 2) F^2 (1 - F)^(11 - 3) / 2 = 1.15%

Again, doesn't really matter what F is, since we are only interested in the comparison...
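
For anyone who wants to check these, here is a minimal Python sketch of the rebuild formulas above. It assumes every remaining drive fails independently with the same probability F during the rebuild window, which is the same simplification the formulas make. The function name rebuild_loss and the extra F values (1% and 10%) are illustrative, not from the hardforum post.

```python
from math import comb


def rebuild_loss(n_drives: int, parity: int, f: float) -> float:
    """Chance of data loss while rebuilding one failed drive in a vdev.

    One drive has already failed, so n_drives - 1 remain. The vdev is lost
    when at least `parity` of those remaining drives also fail, each
    independently with probability f (an assumption).
    """
    remaining = n_drives - 1
    # Probability that fewer than `parity` additional drives fail.
    survive = sum(
        comb(remaining, k) * f**k * (1 - f) ** (remaining - k)
        for k in range(parity)
    )
    return 1 - survive


# (label, total drives, parity level) taken from the formulas above.
LAYOUTS = [("raidz1", 9, 1), ("raidz2", 10, 2), ("raidz3", 11, 3)]

if __name__ == "__main__":
    for f in (0.01, 0.05, 0.10):
        row = ", ".join(
            f"{name}: {rebuild_loss(n, p, f):.2%}" for name, n, p in LAYOUTS
        )
        print(f"F = {f:.0%} -> {row}")
```

With F = 5% this reproduces the 33.7%, 7.1% and 1.15% figures, and for the other F values tried the ordering of the three layouts stays the same.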

From what you said above, striping 3 different raidz3 vdevs together into one pool is ADDITIVE ... so the 1.15% rises to 3.45%.

Yes?

So we triple our risk by running all three raidz3 arrays in one pool, but we still have less than half the risk of a single raidz2 vdev (with no striping) which is 7.1%.
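
A quick sketch of that comparison in Python, assuming the three vdevs fail independently of each other (an assumption; shared controllers, power supplies or drive batches would break it). Since losing any one vdev loses the pool, the pool-level figure under independence is 1 - (1 - p)^n, and adding the probabilities is a slight overestimate of that. The helper name pool_loss is illustrative.

```python
def pool_loss(vdev_loss: float, n_vdevs: int) -> float:
    """Chance that at least one of n independent vdevs is lost.

    A striped pool is lost as soon as any single vdev is lost,
    so this is 1 - P(every vdev survives).
    """
    return 1 - (1 - vdev_loss) ** n_vdevs


P_RAIDZ3 = 0.0115  # single raidz3 vdev, figure from this post
P_RAIDZ2 = 0.071   # single raidz2 vdev, figure from this post

additive = 3 * P_RAIDZ3          # the "just add them" estimate
exact = pool_loss(P_RAIDZ3, 3)   # same thing without the approximation

if __name__ == "__main__":
    print(f"3 x raidz3, additive estimate: {additive:.2%}")
    print(f"3 x raidz3, independent vdevs: {exact:.2%}")
    print(f"1 x raidz2:                    {P_RAIDZ2:.2%}")
    print(f"pool is under half the raidz2 risk: {exact < P_RAIDZ2 / 2}")
```

For probabilities this small the two estimates differ only in the second decimal place, so the "triple the risk" reading is close enough for comparing layouts.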

Am I on the right track here?  I think I'm missing something, because with one raidz3 I have a 1.15% chance of "losing a drive during rebuild", but I am thinking about completely healthy arrays that have a larger chance of blowing up because ONE OF THE OTHER vdevs blows four drives simultaneously.

So I am really comparing 0% probability (if they aren't combined in a zpool, I can take one vdev out and run over it with a train and the other vdev is unharmed) with X% probability, because now something happening in the other vdev can ruin the healthy one...

Am I really the only person worrying about the interactive failure properties of combining vdevs into a pool?

