ZFS Advice

Matt Simerson matt at corp.spry.com
Wed Aug 6 07:32:47 UTC 2008


On Aug 5, 2008, at 7:41 PM, George Hartzell wrote:

> Wes Morgan writes:
>> I'm looking for information and advice from those experienced in  
>> building
>> storage arrays with good performance.
>> [...]
>
> I don't have any experimental data to contribute, but there are some
> interesting hardware discussions on the opensolaris ZFS web site and
> blog entries.
>
> It does seem to be true that ZFS does best when used with simple
> controllers, and a lot of the opensolaris community seems to like this
> relatively inexpensive 8-port card (<$100 at Newegg); it's apparently
> what Sun ships in their "Thumper":
>
>  http://www.supermicro.com/products/accessories/addon/AoC-SAT2-MV8.cfm
>
> One of the opensolaris threads links off to this blog entry, which
> discusses experiences with PCI-X and PCI-E cards; it might be useful.
>
>  http://jmlittle.blogspot.com/2008/06/recommended-disk-controllers-for-zfs.html
>
> The AoC-SAT2-MV8 is based on the "Marvell Hercules-2 Rev. C0 SATA host
> controller", which seems to be AKA 88SX6081, which is listed as
> supported by the ata driver in 7.0-RELEASE.  Has anyone had any ZFS
> experience with it?

I had three of them inside a SuperMicro 24-disk chassis with 16GB RAM,
an 8-core Xeon, and 24 1TB disks. The other 24-disk chassis, otherwise
identical, I built with two Areca 1231ML SATA RAID controllers.

I first tested UFS performance with a single disk on FreeBSD (Areca  
and Marvell) and OpenSolaris (Marvell). Write performance heavily  
favored the Areca card with the optional BBWC. The read performance  
was close enough to call even.
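
(If you want to reproduce the single-disk numbers, something as crude
as dd is enough for a first pass; a rough sketch, where the mount
point and file size are placeholders, and your favorite benchmark tool
will tell you more:)

  # sequential write: 8GB of zeroes in 1MB blocks to the UFS-mounted test disk
  dd if=/dev/zero of=/mnt/test/bigfile bs=1m count=8192
  # sequential read back; remount (or otherwise defeat the cache) first
  dd if=/mnt/test/bigfile of=/dev/null bs=1m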

Then I tested ZFS with RAIDZ in various configs (raidz and raidz2,
with 4-, 6-, and 8-disk arrays) on FreeBSD. With raidz on FreeBSD, the
performance difference between the controllers is much smaller: it's
bad with the Areca controller and worse with the Marvell. My overall
impression is that ZFS performance under FreeBSD is poor.
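
(If anyone wants to repeat those tests, the pool layouts are just
variations on zpool create; a rough sketch with placeholder disk
names, destroying the pool between runs:)

  # 4-disk raidz (single parity)
  zpool create tank raidz da0 da1 da2 da3
  # 6-disk raidz2 (double parity)
  zpool create tank raidz2 da0 da1 da2 da3 da4 da5
  # 8-disk raidz2
  zpool create tank raidz2 da0 da1 da2 da3 da4 da5 da6 da7
  # tear it down before trying the next layout
  zpool destroy tank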

I say this because I also tested one of the systems with OpenSolaris
on the Marvell card (OpenSolaris doesn't support the Areca). Read
performance with ZFS and RAIDZ was not just 2-3x but 10-12x faster on
Solaris. OpenSolaris write performance was about 50% faster than
FreeBSD on the Areca controller and 100% faster than FreeBSD on the
Marvell.

The only way I could get decent performance out of FreeBSD and ZFS was
to use the Areca as a RAID controller and then have ZFS stripe the
data across the two RAID arrays. I haven't tried it, but I'm willing
to bet that if I used UFS and geom_stripe to do the same thing, I'd
get better performance with UFS. If you are looking for performance,
raidz on ZFS is not where you want to be looking.
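
(Roughly what that looks like, plus the gstripe/UFS alternative I
haven't tried; da0 and da1 are placeholders for the two volumes the
Areca cards export:)

  # ZFS striping across the two hardware RAID volumes
  zpool create backup da0 da1
  # the untried UFS alternative: stripe the same two volumes with geom_stripe
  kldload geom_stripe
  gstripe label -v st0 /dev/da0 /dev/da1
  newfs -U /dev/stripe/st0
  mount /dev/stripe/st0 /backup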

I use ZFS because these are backup servers; without the file system
compression, I'd be using 16TB of storage instead of 11TB.
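
(Turning compression on and checking what it buys you is trivial; a
sketch, assuming a pool named backup:)

  # enable compression (lzjb) on the pool; new writes get compressed
  zfs set compression=on backup
  # see how much space it is actually saving
  zfs get compressratio backup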

As for prefetch: under my workloads (heavy network and file system
I/O), prefetch means an almost instant crash and burn. As soon as I
put any heavy load on the box, it hangs (as I've described previously
on this list).
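
(For anyone who wants to run without prefetch, it can be disabled with
a loader tunable in /boot/loader.conf, effective after a reboot:)

  # disable ZFS file-level prefetch
  vfs.zfs.prefetch_disable=1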

Because I need the performance and prefer FreeBSD, the Areca cards  
with BBWC are well worth it. But if you need serious performance on a  
shoestring budget, consider running OpenSolaris with the Marvell cards.

As to whether an 8- or 16-port card will perform better, it depends on
the bus and the cards. As long as you are using them on a PCIe
multilane bus, you'll likely be hitting your disks' I/O limits long
before you reach the bus limits, so it won't matter much.

3ware controllers are cheap, and you get what you pay for. At my last
job we had thousands of 3ware cards deployed because they were so
inexpensive, and RAID = RAID, right?  Well, they were the controllers
most likely to result in catastrophic data loss for our clients. Maybe
it's because the interface confuses the NOC technicians, maybe it's
because their recovery tools suck, or maybe it's because when the
controller fails it hoses the disks in interesting ways. For whatever
reason, our luck at recovering failed RAID arrays on 3ware cards was
poor.

I've used a lot of LSI cards in the past. They work well, but they
aren't performance stars. The Areca is going to be a faster card than
the others, and it comes with a built-in Ethernet jack. Plug that
sucker into your private network and manage the card remotely through
its built-in web server. That's a nice feature. :)

Several weeks after deploying both systems, we took down the AoC-based
one and retrofitted it with another pair of Areca controllers.
Publishing the benchmarks is on my TODO list.

Matt


