ZFS confusion
Trond Endrestøl
Trond.Endrestol at fagskolen.gjovik.no
Mon Jan 27 09:08:30 UTC 2014
On Sat, 25 Jan 2014 19:12-0000, Kaya Saman wrote:
> Hi,
>
> I'm really confused about something so I hope someone can help me clear the
> fog up....
>
> basically I'm about to setup a ZFS RAIDZ3 pool and having discovered this
> site:
>
> https://calomel.org/zfs_raid_speed_capacity.html
>
> as a reference for disk quantity, I got totally confused.
Dead link as far as I can tell.
> Though in addition have checked out these sites too:
>
> https://blogs.oracle.com/ahl/entry/triple_parity_raid_z
>
> http://www.zfsbuild.com/2010/06/03/howto-create-raidz2-pool/
>
> http://www.zfsbuild.com/2010/05/26/zfs-raid-levels/
>
> http://www.linux.org/threads/zettabyte-file-system-zfs.4619/
>
>
> While implementing a test ZFS pool on my old FreeBSD 8.3 box using dd-derived
> vdevs, and reading the man page for zpool, I found that raidz3 needs a minimum
> of 4 disks to work.
>
> However, according to the above mentioned site for triple parity one should
> use 5 disks in 2+3 format.
>
> My confusion is this: does the 2+3 mean 2 disks in the pool with 3 hot spares
> or does it mean 5 disks in the pool?
No one's answered this, so I'll just give you my 2 cents.
Triple parity means you're using the storage capacity equivalent of
three drives for parity alone. If you use five drives in total, that
gives you 2 drives' worth of real data and 3 drives' worth of parity.
In other words, you should really consider using a lot more drives
when running triple parity, say nine drives (6 data + 3 parity).
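To put numbers on it (just shell arithmetic, using the drive counts
discussed above):

```shell
# raidz3 always consumes three disks' worth of capacity for parity,
# regardless of how wide the vdev is: usable data disks = width - 3.
parity=3
for width in 4 5 9; do
    echo "raidz3 width $width: $((width - parity)) data + $parity parity"
done
```

At width 4 (the minimum) you keep only one disk's worth of data; at
width 9 you keep six, which is why wider raidz3 vdevs are the usual
recommendation.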
> As in:
>
> zpool create <pool_name> raidz3 disk1 disk2 disk3 disk4 disk5
No spares are configured. You should consider something like this:
zpool create <pool_name> raidz3 disk1 disk2 disk3 disk4 disk5 spare disk6 disk7 disk8
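Since you're already testing with file-backed vdevs, the whole
experiment can be scripted. This is only a sketch: it needs root and a
system with ZFS loaded, and the pool and file names are made up for
illustration.

```shell
# Create eight file-backed vdevs (64 MB is the minimum vdev size).
for i in 1 2 3 4 5 6 7 8; do
    truncate -s 64m /tmp/disk$i
done

# Five-disk raidz3 (2 data + 3 parity) with three spares.
zpool create testpool raidz3 /tmp/disk1 /tmp/disk2 /tmp/disk3 \
    /tmp/disk4 /tmp/disk5 spare /tmp/disk6 /tmp/disk7 /tmp/disk8

zpool status testpool

# Clean up the experiment afterwards.
zpool destroy testpool
```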
> In addition to my testing I was looking at ease of expansion... i.e. growing
> the pool, so is doing something like this:
>
> zpool create <pool_name> raidz3 disk1 disk2 disk3 disk4
>
> Then when I needed to expand just do:
>
> zpool add <pool_name> raidz3 disk5 disk6 disk7 disk8
You should do some further experimentation, but note that zpool attach
only works on single-disk and mirror vdevs; you cannot attach
additional disks to an existing raidz3 vdev to widen it.
> which gets:
>
> pool: testpool
> state: ONLINE
> status: The pool is formatted using a legacy on-disk format. The pool can
> still be used, but some features are unavailable.
> action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
> pool will no longer be accessible on software that does not support
> feature flags.
> scan: none requested
> config:
>
> NAME STATE READ WRITE CKSUM
> testpool ONLINE 0 0 0
> raidz3-0 ONLINE 0 0 0
> /tmp/disk1 ONLINE 0 0 0
> /tmp/disk2 ONLINE 0 0 0
> /tmp/disk3 ONLINE 0 0 0
> /tmp/disk4 ONLINE 0 0 0
> raidz3-1 ONLINE 0 0 0
> /tmp/disk5 ONLINE 0 0 0
> /tmp/disk6 ONLINE 0 0 0
> /tmp/disk7 ONLINE 0 0 0
> /tmp/disk8 ONLINE 0 0 0
This is not a clever setup. Of the eight drives configured, you only
get the real storage capacity equivalent of two of them: each 4-disk
raidz3 vdev contributes just one data drive. Maybe you're aiming for
redundancy rather than storage capacity.
> ----------
>
> The same as this:
>
> ----------
>
> pool: testpool
> state: ONLINE
> status: The pool is formatted using a legacy on-disk format. The pool can
> still be used, but some features are unavailable.
> action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
> pool will no longer be accessible on software that does not support
> feature flags.
> scan: none requested
> config:
>
> NAME STATE READ WRITE CKSUM
> testpool ONLINE 0 0 0
> raidz3-0 ONLINE 0 0 0
> /tmp/disk1 ONLINE 0 0 0
> /tmp/disk2 ONLINE 0 0 0
> /tmp/disk3 ONLINE 0 0 0
> /tmp/disk4 ONLINE 0 0 0
> /tmp/disk5 ONLINE 0 0 0
> /tmp/disk6 ONLINE 0 0 0
> /tmp/disk7 ONLINE 0 0 0
> /tmp/disk8 ONLINE 0 0 0
This setup is a bit more clever, as you'll get the storage capacity
equivalent of five drives for real data and the storage capacity
equivalent of three drives for parity.
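The difference between the two layouts is easy to check with the same
arithmetic as before (eight disks either way):

```shell
disks=8

# (a) two raidz3 vdevs of four disks each: each vdev yields 4 - 3 = 1
# data disk, and usable capacity is summed across vdevs.
vdevs=2; width=4
data_a=$((vdevs * (width - 3)))

# (b) one raidz3 vdev of eight disks: 8 - 3 = 5 data disks.
data_b=$((disks - 3))

echo "two 4-wide raidz3 vdevs: $data_a data disks"
echo "one 8-wide raidz3 vdev:  $data_b data disks"
```

Layout (a) does survive up to three failures per vdev (six in total,
if they fall the right way), but at a steep cost in capacity.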
> ?? Of course, using the 1st method there is extra metadata involved, but not
> too much, especially with TB drives.
>
> Having created a zfs filesystem on top of both setups, the fs will grow over
> the 1st scenario to utilize disks 5 through 8 added later; while of course
> with the second setup the filesystem is already created over all 8 disks.
>
>
> In a real situation however, the above would certainly be 5 disks at a time to
> gain the triple parity, with ZIL and L2ARC on SSD's and hot swap spares.
>
>
> The reason I'm asking the above is that I've got a new enclosure with up to 26
> disk capacity and need to create a stable environment and make best use of the
> space. In other words, maximum redundancy with max capacity allowed per
> method: which would be raidz1..3, and of course raidz3 offers the best
> redundancy yet has much more capacity than a raid1+0 setup.
I'm a bit unsure myself, but as far as I know you cannot attach new
disks to an existing raidz3 vdev to widen it; growing the pool means
adding entire new raidz3 vdevs. Once you add a vdev, there's no going
back unless you are prepared to recreate the entire pool. Maybe
someone else can chime in on this.
> My intention was to grab 5 disks to start with, then expand as necessary, plus 2
> SSDs for ZIL+L2ARC (raid0 striping and raid1 mirroring, respectively), and then
> 3x hot-swap spares, and use lz4 compression on the filesystem. With FreeBSD 10.0
> as the base OS... my current 8.3 must be EOL now, though it's on a different box
> so no matter :-)
Spare drives can be added at any time (zpool add <pool_name> spare diskN)
and removed (zpool remove <pool_name> diskN), unless the spare is
currently in use by a pool.
> Hopefully someone can help me understanding the above.
>
>
> Many thanks.
>
>
> Regards,
>
>
> Kaya
--
+-------------------------------+------------------------------------+
| Vennlig hilsen, | Best regards, |
| Trond Endrestøl, | Trond Endrestøl, |
| IT-ansvarlig, | System administrator, |
| Fagskolen Innlandet, | Gjøvik Technical College, Norway, |
| tlf. mob. 952 62 567, | Cellular...: +47 952 62 567, |
| sentralbord 61 14 54 00. | Switchboard: +47 61 14 54 00. |
+-------------------------------+------------------------------------+