N-way mirror read speedup in zfsonlinux

Alexander Motin mav at FreeBSD.org
Tue Aug 6 16:01:48 UTC 2013


On 06.08.2013 18:37, Steven Hartland wrote:
>> ----- Original Message ----- From: "Alexander Motin" <mav at FreeBSD.org>
>> On 04.08.2013 21:22, Alexander Motin wrote:
>>> On 04.08.2013 21:18, Steven Hartland wrote:
>>>> Interesting stuff.
>>>>
>>>> I created a little test scenario here today to run this through its
>>>> passes.
>>>>
>>>> It's very basic: running 10 x dd's reading from 5 x 5GB test files to
>>>> /dev/null on a pool made up of 4 SSDs and 1 HDD in a mirror:
>>>>
>>>>   pool: tpool
>>>> state: ONLINE
>>>>   scan: resilvered 38.5K in 0h0m with 0 errors on Sun Aug  4 18:13:59 2013
>>>> config:
>>>>
>>>>         NAME        STATE     READ WRITE CKSUM
>>>>         tpool       ONLINE       0     0     0
>>>>           mirror-0  ONLINE       0     0     0
>>>>             ada2    ONLINE       0     0     0
>>>>             ada3    ONLINE       0     0     0
>>>>             ada4    ONLINE       0     0     0
>>>>             ada5    ONLINE       0     0     0
>>>>             ada1    ONLINE       0     0     0
>>>>
>>>> The results are quite telling:-
>>>>
>>>> == Without Patch ==
>>>> === SSDs & HD ===
>>>> Read of 51200MB using bs 1048576 took 51 seconds @ 1003 MB/s
>>>> Read of 51200MB using bs 4096 took 51 seconds @ 1003 MB/s
>>>> Read of 51200MB using bs 512 took 191 seconds @ 268 MB/s
>>>>
>>>> === SSDs Only ===
>>>> Read of 51200MB using bs 1048576 took 40 seconds @ 1280 MB/s
>>>> Read of 51200MB using bs 4096 took 41 seconds @ 1248 MB/s
>>>> Read of 51200MB using bs 512 took 188 seconds @ 272 MB/s
>>>>
>>>> == With Patch ==
>>>> === SSDs & HD ===
>>>> Read of 51200MB using bs 1048576 took 32 seconds @ 1600 MB/s
>>>> Read of 51200MB using bs 4096 took 31 seconds @ 1651 MB/s
>>>> Read of 51200MB using bs 512 took 184 seconds @ 278 MB/s
>>>>
>>>> === SSDs Only ===
>>>> Read of 51200MB using bs 1048576 took 28 seconds @ 1828 MB/s
>>>> Read of 51200MB using bs 4096 took 29 seconds @ 1765 MB/s
>>>> Read of 51200MB using bs 512 took 185 seconds @ 276 MB/s
>>>>
>>>> Even with only the SSDs, the patched version performs
>>>> noticeably better. I suspect this is down to the fact that
>>>> the SSDs are of various makes, so they have slightly different IO
>>>> characteristics.
>>>>
>>>> N.B. The bs 512 tests can be mostly discounted, as they were CPU
>>>> limited in dd on the 8-core test machine.
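>>>>
>>>> Each reader is just a plain dd of one of the test files to /dev/null,
>>>> along these lines (the file path here is illustrative, not the exact
>>>> one from my script):
>>>>
>>>>   dd if=/tpool/test1 of=/dev/null bs=1048576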
>>>
>>> Could you also run a test with HDDs only and with a different (lower)
>>> number of dd's? SSDs are much more forgiving due to their lack of seek time.
>>
>> I couldn't wait and did it myself with 4 HDDs in a mirror:
>> Without patch:
>>  1xdd 360MB/s
>>  2xdd 434MB/s
>>  4xdd 448MB/s
>>
>> With patch:
>>  1xdd 167MB/s
>>  2xdd 310MB/s
>>  4xdd 455MB/s
>>
>> So yes, while it helps with multi-threaded reads, sequential
>> low-threaded reads are heavily harmed. I would not call it a win.
>>
>
> Hmmm, my results for two HDDs disagree with your results:-
>
> == 2 disks - Without Patch (max_pending=10) ==
> Read of 5120MB using bs: 1048576, readers: 1, took 34 seconds @ 150 MB/s
> Read of 10240MB using bs: 1048576, readers: 2, took 71 seconds @ 144 MB/s
> Read of 25600MB using bs: 1048576, readers: 5, took 188 seconds @ 136 MB/s
> Read of 51200MB using bs: 1048576, readers: 10, took 188 seconds @ 272 MB/s
>
> == 2 disks - With Patch (max_pending=10) ==
> Read of 5120MB using bs: 1048576, readers: 1, took 24 seconds @ 213 MB/s
> Read of 10240MB using bs: 1048576, readers: 2, took 59 seconds @ 173 MB/s
> Read of 25600MB using bs: 1048576, readers: 5, took 114 seconds @ 224 MB/s
> Read of 51200MB using bs: 1048576, readers: 10, took 168 seconds @ 304 MB/s
>
> == 2 disks - Without Patch (max_pending=100) ==
> Read of 5120MB using bs: 1048576, readers: 1, took 37 seconds @ 138 MB/s
> Read of 10240MB using bs: 1048576, readers: 2, took 80 seconds @ 128 MB/s
> Read of 25600MB using bs: 1048576, readers: 5, took 206 seconds @ 124 MB/s
> Read of 51200MB using bs: 1048576, readers: 10, took 208 seconds @ 246 MB/s
>
> == 2 disks - With Patch (max_pending=100) ==
> Read of 5120MB using bs: 1048576, readers: 1, took 24 seconds @ 213 MB/s
> Read of 10240MB using bs: 1048576, readers: 2, took 49 seconds @ 208 MB/s
> Read of 25600MB using bs: 1048576, readers: 5, took 113 seconds @ 226 MB/s
> Read of 51200MB using bs: 1048576, readers: 10, took 116 seconds @ 441 MB/s
>
> Visually watching gstat output shows the first disk (ada1) @ 50% busy and
> the second disk (ada5) @ 100% busy most of the time without the patch,
> whereas with the patch they both take an even share of the load.
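>
> For reference, the gstat invocation is something like the following
> (standard gstat flags; the device filter is obviously specific to this
> box):
>
>   gstat -p -I 1s -f '^ada[15]$'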
>
> Note: I believe any values over ~240MB/s are due to ARC replay.
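>
> For anyone reproducing this: max_pending is the per-vdev queue depth,
> which on this vintage of the code should be the vfs.zfs.vdev.max_pending
> sysctl (name from memory, so treat it as an assumption):
>
>   sysctl vfs.zfs.vdev.max_pending=100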

Please show your test script so I can try to reproduce it. An experiment 
with more than 2 disks could be made if we are talking about N-way mirrors.
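
To be concrete, I would expect a harness along the lines of this minimal 
sketch (pool path, file naming, reader count and block size are my 
placeholders, not necessarily yours):

  #!/bin/sh
  # Parallel sequential-read benchmark sketch. Assumes the test files
  # were written beforehand and that the pool was exported/imported (or
  # the machine rebooted) so the ARC is cold and reads hit the disks.
  POOL=/tpool
  READERS=4        # number of concurrent dd readers
  BS=1048576       # read block size in bytes

  t0=$(date +%s)
  i=1
  while [ $i -le $READERS ]; do
      dd if=$POOL/test$i of=/dev/null bs=$BS 2>/dev/null &
      i=$((i + 1))
  done
  wait             # block until every reader has finished
  t1=$(date +%s)
  echo "readers=$READERS bs=$BS took $((t1 - t0)) seconds"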

Also, aside from the practical results, it would be good to get a logical 
explanation, with some answer to my counterarguments.

-- 
Alexander Motin

