ZFS I/O Throughput question

Bernd Walter ticso at cicely7.cicely.de
Wed Sep 15 08:45:18 UTC 2010


On Wed, Sep 15, 2010 at 03:05:46AM -0500, Chris Watson wrote:
> I have been testing ZFS on a home box now for a few days and I have a  
> question that is perplexing me. Everything I have read on ZFS says in  
> almost every case mirroring is faster than raidz. So I initially set up
> a 2x2 RAID 10 striped mirror. Like so:
> 
> priyanka# zpool status
>   pool: tank
>  state: ONLINE
>  scrub: none requested
> config:
> 
> 	NAME        STATE     READ WRITE CKSUM
> 	tank        ONLINE       0     0     0
> 	  mirror    ONLINE       0     0     0
> 	    ada2    ONLINE       0     0     0
> 	    ada3    ONLINE       0     0     0
> 	  mirror    ONLINE       0     0     0
> 	    ada4    ONLINE       0     0     0
> 	    ada5    ONLINE       0     0     0
> 
> errors: No known data errors
> priyanka#
> 
> With this configuration I am getting the following throughput for writes:
> 
> priyanka# dd if=/dev/zero of=/tank/Aperture/test01 bs=1m count=10000
> 10000+0 records in
> 10000+0 records out
> 10485760000 bytes transferred in 98.533820 secs (106417878 bytes/sec)
> priyanka#
> 
> And for reads:
> 
> priyanka# dd if=/tank/Aperture/test01 of=/dev/null bs=1m
> 10000+0 records in
> 10000+0 records out
> 10485760000 bytes transferred in 50.309988 secs (208423027 bytes/sec)
> priyanka#
> 
> So basically 100 MB/s writes, 200 MB/s reads.

Not surprising - two disks in parallel are used to write data.
The data could probably have been laid out over the whole stripe set, so
that twice the number of disks would actually be used, but that
optimization for single linear file access is bad for random performance,
since you then need to seek on all drives.

> I thought the disks I have would do a little better than that assuming  
> from much of the zfs literature proclaiming mirroring to be fastest  
> with more I/O and more OPS/sec. Well I decided to blow away the mirror  
> and instead do a 4 disk raidz to see just how much faster mirroring  
> was with ZFS vs raidz. This is where I was blown away and more than a  
> little confused.
> 
> priyanka# zpool status
>   pool: tank
>  state: ONLINE
>  scrub: none requested
> config:
> 
> 	NAME        STATE     READ WRITE CKSUM
> 	tank        ONLINE       0     0     0
> 	  raidz1    ONLINE       0     0     0
> 	    ada2    ONLINE       0     0     0
> 	    ada3    ONLINE       0     0     0
> 	    ada4    ONLINE       0     0     0
> 	    ada5    ONLINE       0     0     0
> 
> errors: No known data errors
> priyanka#
> 
> Write performance:
> 
> priyanka# dd if=/dev/zero of=/tank/test.001 bs=1m count=10000
> 10000+0 records in
> 10000+0 records out
> 10485760000 bytes transferred in 34.310930 secs (305609903 bytes/sec)
> priyanka#

You basically have 3 drives' worth of bandwidth to write to - one drive's
worth of every stripe is parity, which is redundant data, so it doesn't
add to the bandwidth.
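
As a rough check of that reasoning, the dd figures quoted above can be
divided by the number of disks that carry distinct data. This is only a
sketch: per_disk_mb is a hypothetical helper name, the disk counts (2 for
the striped mirror, 3 for the raidz1) are taken from the explanation in
this reply, and MB here means 1000000 bytes.

```shell
#!/bin/sh
# Hypothetical helper: convert a dd bytes/sec figure into decimal MB/s
# per data-carrying disk.
#   $1 = bytes/sec as reported by dd
#   $2 = number of disks carrying distinct (non-redundant) data
per_disk_mb() {
    awk -v bps="$1" -v n="$2" 'BEGIN { printf "%.1f\n", bps / n / 1000000 }'
}

per_disk_mb 106417878 2   # striped mirror write, 2 mirror vdevs -> 53.2
per_disk_mb 305609903 3   # raidz1 write, 3 disks' worth of data -> 101.9
```

The numbers only describe this one sequential dd run; they say nothing
about random or concurrent access.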

> Read performance:
> 
> priyanka# dd if=/tank/test.001 of=/dev/null bs=1m count=10000
> 10000+0 records in
> 10000+0 records out
> 10485760000 bytes transferred in 31.463025 secs (333272467 bytes/sec)
> priyanka#

Now you have 4 drives to read from.
The problem, however, is that you seek on all four drives.
You get the same pessimisation for random access as you would if your
mirror pool had spread the data over all disks.
The only difference is that with a single raidz you no longer have a
choice.

> Say whaaaaaat?! Perhaps I am completely misunderstanding every zfs  
> admin guide, FAQ and paper on ZFS. But everything I have read says  
> mirroring should be much faster than a raidz and should almost always  
> be preferred. Which clearly from above is not the case. The only thing  
> I can think of is that the dd "benchmark" is not accurate because it  
> is writing data sequentially? Which is the place raidz has an edge  
> over mirroring, again from what I have read. But the above is not so  
> much an 'edge' in performance as much as a complete and total data  
> rape. So my question is, is everything i've read about ZFS and  
> mirroring vs raidz wrong? Is the benchmark horribly flawed? Is raidz  
> actually faster versus mirroring? Does FreeBSD perform some kind of  
> voodoo h0h0magic that makes raidz perform much better than mirroring  
> in ZFS than other platforms? Or am I just having a really weird dream  
> and none of this is real.

That's exactly the point - your dd benchmark only tests one very specific
case, which in fact might match your application, but in almost every
use case you access multiple files at the same time, and then it is
good to be able to seek the drives independently.
Just repeat the same test with two files written/read at the same time
and you should easily see a major difference.
You should also note that all the cases where linear reads are faster
than a single drive only work because of very aggressive read-ahead.
The faster your drives are and the more drives you have, the more
aggressive the read-ahead must be to still get a win - in the 4 disk raidz
read case you already seem to have reached some kind of limit.
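
The two-file test suggested above could be sketched roughly like this.
The function name parallel_dd_test and the file names par.001/par.002 are
made up for illustration; the block size is given in bytes (1048576 is
the same as bs=1m) so the same line works with dd implementations that
spell the suffix differently. Run it once per pool layout and compare the
per-file rates dd reports.

```shell
#!/bin/sh
# Hypothetical helper: write two files at the same time, then read both
# back at the same time, so the drives have to serve two streams at once.
#   $1 = directory on the pool under test (e.g. /tank)
#   $2 = number of 1 MiB blocks per file
parallel_dd_test() {
    dir=$1
    count=$2
    rc=0

    # Concurrent writes: both dd processes run in the background.
    dd if=/dev/zero of="$dir/par.001" bs=1048576 count="$count" &
    w1=$!
    dd if=/dev/zero of="$dir/par.002" bs=1048576 count="$count" &
    w2=$!
    wait "$w1" || rc=1
    wait "$w2" || rc=1

    # Concurrent reads of the two files just written.
    dd if="$dir/par.001" of=/dev/null bs=1048576 &
    r1=$!
    dd if="$dir/par.002" of=/dev/null bs=1048576 &
    r2=$!
    wait "$r1" || rc=1
    wait "$r2" || rc=1

    return "$rc"
}

# Example invocation (not run automatically):
#   parallel_dd_test /tank 10000
```

Each dd prints its own transfer statistics, so the combined throughput is
the sum of the two rates. The files should be considerably larger than
RAM, as in the original 10000 x 1 MiB runs, otherwise the read pass may be
served largely from cache.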

-- 
B.Walter <bernd at bwct.de> http://www.bwct.de
Modbus/TCP Ethernet I/O Baugruppen, ARM basierte FreeBSD Rechner uvm.

