ZFS & GEOM with many odd drive sizes
Doug Rabson
dfr at rabson.org
Wed Jul 25 11:23:15 UTC 2007
On 25 Jul 2007, at 11:13, Mark Powell wrote:
> On Wed, 25 Jul 2007, Doug Rabson wrote:
>
>> I'm not really sure why you are using gmirror, gconcat or gstripe at
>> all. Surely it would be easier to let ZFS manage the mirroring and
>> concatenation. If you do that, ZFS can use its checksums to
>> continually monitor the two sides of your mirrors for consistency
>> and will be able to notice as early as possible when one of the
>> drives goes flaky. For concats, ZFS will also spread redundant
>> copies of metadata (and regular data if you use 'zfs set
>> copies=<N>') across the disks in the concat. If you have to replace
>> one half of a mirror, ZFS has enough information to know exactly
>> which blocks need to be copied to the new drive, which can make
>> recovery much quicker.
>
> gmirror is only going to be used for the UFS /boot partition and
> block-device swap. (I'll ignore the smallish space used by that below.)
Just to muddy the waters a little - I'm working on ZFS native boot
code at the moment. It probably won't ship with 7.0 but should be
available shortly after.
> I thought gstripe was a solution because I mentioned in the original
> post that I have the following drives to play with: 1x400GB,
> 3x250GB, 3x200GB.
> If I make a straight zpool with all those drives, raidz treats
> every member as the size of the smallest, so I get a 7x200GB raidz
> with only an effective 6x200GB = 1200GB of usable storage. Also, a
> 7-device raidz cries out to be a raidz2, and that's a further 200GB
> of storage lost.
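A quick sanity check of that arithmetic (sizes only, no real devices; the variable names are just for illustration):

```shell
# raidz usable space = (members - parity) * smallest member
drives="400 250 250 250 200 200 200"   # sizes in GB
smallest=400
n=0
for d in $drives; do
  n=$((n + 1))
  if [ "$d" -lt "$smallest" ]; then smallest=$d; fi
done
raidz1=$(( (n - 1) * smallest ))
raidz2=$(( (n - 2) * smallest ))
echo "raidz1 usable: ${raidz1}GB"   # 6 x 200GB = 1200GB
echo "raidz2 usable: ${raidz2}GB"   # 5 x 200GB = 1000GB
```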
> My original plan (because the largest drive is a single 400GB) was
> to gconcat (now to gstripe) the smaller drives into 3 pairs of
> 250GB+200GB, making three new 450GB devices. This would make a
> zpool of 4 devices, i.e. 1x400GB+3x450GB, giving effective storage
> of 1200GB. Yes, it's the same as above (as long as raidz2 is not
> used there), but I was thinking about future expansion...
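That layout could be sketched roughly like this (device names and the pool name "tank" are hypothetical; this assumes the stock gstripe and zpool commands, and real disks, so it is a sketch rather than something to paste blindly):

```shell
# Pair each 250GB drive with a 200GB drive into a ~450GB stripe;
# the resulting devices appear as /dev/stripe/<name>.
gstripe label pair0 /dev/ad4 /dev/ad6
gstripe label pair1 /dev/ad8 /dev/ad10
gstripe label pair2 /dev/ad12 /dev/ad14

# Build a 4-device raidz from the single 400GB drive (ad2)
# plus the three 450GB stripe devices.
zpool create tank raidz ad2 stripe/pair0 stripe/pair1 stripe/pair2
```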
> The advantage this approach seems to give is that when drives fail,
> each device (which is either a single drive or a gstripe pair) can
> be replaced with a modern larger drive (500GB or 750GB, depending
> on what's economical at the time).
> Once that replacement has been performed only 4 times, the zpool
> will increase in size (actually it will increase straight away by
> 4x50GB total if the 400GB drive fails first).
> In addition, once both drives in a pair have failed and been
> replaced by a single large drive, there will also be spare 250GB
> or 200GB drives which can be further added to the zpool as a ZFS
> mirror.
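The replacement workflow described above might look something like this (again with hypothetical device names, assuming the pool and stripes from the earlier sketch):

```shell
# A member of stripe/pair0 has died: swap the whole pair out for a
# single new, larger drive (say a 500GB ad16) and let ZFS resilver.
zpool replace tank stripe/pair0 ad16

# Later, two leftover drives from broken pairs can be attached to the
# pool as an additional mirrored vdev, adding capacity immediately.
zpool add tank mirror ad6 ad10
```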
> The alternative of using a zpool of 7 individual drives means that
> I need to replace many more drives to actually see an increase in
> zpool size.
> Yes, there are a large number of combinations here, but it seems
> that the zpool will increase in size sooner this way?
> I believe my reasoning is correct here? Let me know if your
> experience would suggest otherwise.
> Many thanks.
> Many thanks.
>
Your reasoning sounds fine now that I have the bigger picture in my
head. I don't have a lot of experience here - for my ZFS testing, I
just bought a couple of cheap 300GB drives which I'm using as a
simple mirror. From what I have read, mirrors and raidz2 are roughly
equivalent in 'mean time to data loss' terms with raidz1 quite a bit
less safe due to the extra vulnerability window between a drive
failure and replacement.
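A simple mirror like the one mentioned here is only a couple of commands (device names hypothetical):

```shell
# Two roughly equal drives as a plain ZFS mirror.
zpool create tank mirror ad4 ad6

# 'zpool status' reports both sides and any checksum errors ZFS has
# found; 'zpool scrub' walks all blocks and verifies both halves.
zpool status tank
zpool scrub tank
```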