N-way mirror read speedup in zfsonlinux

Matthew Ahrens mahrens at delphix.com
Wed Aug 14 23:02:37 UTC 2013


This patch looks reasonable to me.  I shared Alexander's skepticism of the
time-based metric.  Just a couple of specific notes:

I believe that mm_preferred is only used for reads.  You might skip the
load-balancing logic entirely if this is not a read.

You might add accessor functions for avl_numnodes(&vq->vq_pending_tree)
and vq_lastoffset, rather than having vdev_disk.c reach into the queue
implementation.  Technically it looks like you could get away without
vq_lock for avl_numnodes(), as it's just loading a "long", but that
really relies on implementation details.  More significantly, you have
no locking for vq_lastoffset.  At a minimum, we should have some comments
explaining why this is OK.  It seems like it could get the wrong result on
32-bit systems (where loads and stores of 64-bit values are not atomic), but I
guess that's acceptable since it would only cause a tiny performance impact.

zfs_mirror_by_load -- could this be removed in favor of setting
zfs_vdev_mirror_locality_bonus = 0 to turn off the bonus?  If not, it seems
like it should at least be a boolean_t.

vdev_mirror_load() -- if the vdev is not rotating, you always apply the
locality bonus.  It was slightly difficult to follow the comment and logic
around this.  Would a "seek penalty", applied when (rotating &&
lastoffset != offset), be easier to understand?

--matt


On Wed, Aug 14, 2013 at 6:50 AM, Steven Hartland <killing at multiplay.co.uk> wrote:

> So, based on mav's comments, I've created a new version of the load-balanced
> patch which takes into account the last I/O's locality on the disk.
>
> I've added detection of non-rotating media, that also triggers the locality
> bonus, which significantly improves mixed HDD and SSD mirrors.
>
> I've also removed the time-based switching, as this was flawed for N-way
> mirrors where N != 2, and didn't seem to provide any benefit either.
>
> One question I did have: is it necessary to protect the call to
> avl_numnodes with the vq_lock mutex?  This seems like it may not
> actually be needed, given it's not iterating over or modifying the AVL tree.
>
> My tests show this version provides a significant performance increase
> over both the original stripe balancing and the zfsonlinux load
> balancing version, resulting in up to 3x the read rate on a 3-way
> mirror with 2 x HDDs and 1 x SSD.
>
> The patch is attached, comments?
>
> Here is my full set of test results:
>
> == Setup ==
> 3-Way Mirror with 2 x HDDs and 1 x SSD
>
> === Prefetch Disabled ===
> ==== Load Balanced (locality) ====
> Read 15360MB using bs: 1048576, readers: 3, took 54 seconds @ 284 MB/s
> Read 30720MB using bs: 1048576, readers: 6, took 77 seconds @ 398 MB/s
> Read 46080MB using bs: 1048576, readers: 9, took 89 seconds @ 517 MB/s
>
> ==== Stripe Balanced (default) ====
> Read 15360MB using bs: 1048576, readers: 3, took 161 seconds @ 95 MB/s
>
> ==== Load Balanced (zfslinux) ====
> Read 15360MB using bs: 1048576, readers: 3, took 297 seconds @ 51 MB/s
>
> === Prefetch Enabled ===
> ==== Load Balanced (locality) ====
> Read 15360MB using bs: 1048576, readers: 3, took 48 seconds @ 320 MB/s
>
> ==== Stripe Balanced (default) ====
> Read 15360MB using bs: 1048576, readers: 3, took 91 seconds @ 168 MB/s
>
> ==== Load Balanced (zfslinux) ====
> Read 15360MB using bs: 1048576, readers: 3, took 108 seconds @ 142 MB/s
>
> == Setup ==
> 2-Way Mirror with 2 x HDDs
>
> === Prefetch Disabled ===
> ==== Load Balanced (locality) ====
> Read 10240MB using bs: 1048576, readers: 2, took 131 seconds @ 78 MB/s
>
> ==== Stripe Balanced (default) ====
> Read 10240MB using bs: 1048576, readers: 2, took 160 seconds @ 64 MB/s
>
> ==== Load Balanced (zfslinux) ====
> Read 10240MB using bs: 1048576, readers: 2, took 207 seconds @ 49 MB/s
>
> === Prefetch Enabled ===
> ==== Load Balanced (locality) ====
> Read 10240MB using bs: 1048576, readers: 2, took 85 seconds @ 120 MB/s
>
> ==== Stripe Balanced (default) ====
> Read 10240MB using bs: 1048576, readers: 2, took 109 seconds @ 93 MB/s
>
> ==== Load Balanced (zfslinux) ====
> Read 10240MB using bs: 1048576, readers: 2, took 94 seconds @ 108 MB/s
>
>
>    Regards
>    Steve
>
>
> _______________________________________________
> zfs-devel at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/zfs-devel
> To unsubscribe, send any mail to "zfs-devel-unsubscribe at freebsd.org"
>

