RFT: ZFS MFC

Alessandro Dellavedova alessandro.dellavedova at ifom-ieo-campus.it
Sun May 17 04:40:53 UTC 2009


On May 16, 2009, at 8:50 AM, Adam McDougall wrote:

> On Fri, May 15, 2009 at 05:02:22PM -0700, Kip Macy wrote:
>
>  I've MFC'd ZFS v13 to RELENG_7 in a work branch. Please test if
>  you can.
>
>  http://svn.freebsd.org/base/user/kmacy/ZFS_MFC/
>
>  The standard disclaimers apply. This has only been lightly tested in
>  a VM. Please do not use it with data you care about at this time.
>
>
>  Thanks,
>  Kip
>
>
> Seems to work for me so far.  I had a zfs send hang part way through
> and with a notable speed difference depending on the direction but
> this is literally the first time I've tried zfs send/recv and the
> systems are setup different so I have no idea if it would have
> happened anyway. Eventually I could probably make these test systems
> more similar to give a fair test, but wanted to mention it so others
> could check.
>
> Thanks for working on the MFC, I'm excited to see progress there!
> It will play a factor in some upcoming server plans even if the MFC
> doesn't happen for months.

We're now testing ZFS under FreeBSD (7.2 and CURRENT) quite
extensively, because our primary goal is to set up a file server that
can scale up to 80TB using high-density, low-cost storage (2TB SATA II
disks in a 42-bay, 4U enclosure). The fact that ZFS v13 has been
MFC'ed sounds good to me. I'd recommend testing the most basic
physical stress conditions, e.g.:

- Pulling out one healthy disk in a raidz2 pool;
- Putting it back and seeing whether resilvering starts in a reasonable
time (we experienced console lockups, although SSH access to the box
was still possible and fully responsive);
- Pulling out TWO healthy disks in a raidz2 pool;
- Putting them back one at a time (and both at the same time) and
seeing whether the pool is resilvered in a reasonable time;
- Too many other combinations to list here.
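
To make "too many combinations" a bit more concrete, here is a rough
sketch of how the pull/reinsert scenarios above could be enumerated
into a test matrix. It only generates the list of cases; it does not
touch any pool. The device names (da0..da3), the Scenario class and
the build_matrix helper are hypothetical and used for illustration
only. The one assumption about ZFS itself is that a raidz2 vdev is
expected to survive the loss of up to two disks.

```python
from dataclasses import dataclass
from itertools import combinations

# Assumption: a raidz2 vdev tolerates the loss of up to two member disks.
RAIDZ2_PARITY = 2


@dataclass(frozen=True)
class Scenario:
    pulled: tuple   # disks physically removed from the vdev
    reinsert: str   # "single", "one-at-a-time" or "simultaneous"

    @property
    def pool_should_survive(self) -> bool:
        # The pool should stay available while pulled disks <= parity.
        return len(self.pulled) <= RAIDZ2_PARITY


def build_matrix(disks, max_pulled=2):
    """Enumerate every pull/reinsert case for one raidz2 vdev."""
    scenarios = []
    for n in range(1, max_pulled + 1):
        for pulled in combinations(disks, n):
            # With a single disk there is only one way to put it back.
            modes = ("single",) if n == 1 else ("one-at-a-time", "simultaneous")
            for mode in modes:
                scenarios.append(Scenario(pulled, mode))
    return scenarios


if __name__ == "__main__":
    # Hypothetical device names for a four-disk raidz2 vdev.
    for s in build_matrix(["da0", "da1", "da2", "da3"]):
        verdict = "should survive" if s.pool_should_survive else "expected to fail"
        print(",".join(s.pulled), s.reinsert, verdict)
```

Even for a four-disk vdev this already gives 16 cases, which is why a
shared, agreed-upon list would be easier to compare across sites than
ad hoc testing.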

I'd be more than happy if the FreeBSD community came up with a
sort of standard ZFS stress test suite (both physical and logical), in
order to be able to compare ZFS reliability across different
scenarios (e.g. our setup is a v20Z server with 6GB RAM, directly
attached via a 2Gb QLogic Fibre Channel adapter; the JBOD is
an Apple Xraid fully loaded with 450GB PATA disks, with all sysctl
parameters set according to the ZFS tuning guide).

I'm strongly convinced that FreeBSD + ZFS can be the silver bullet
for enterprise storage, for the following reasons:

PROS:

- Solaris/OpenSolaris are limited to an NGROUPS_MAX value of 16
(http://bugs.opensolaris.org/view_bug.do?bug_id=4088757,
http://www.j3e.de/ngroups.html) while FreeBSD isn't (we are using an
NGROUPS_MAX value of 64 right now). This matters if you are
serving hundreds of clients authenticated via LDAP (POSIX accounts,
RFC 2307) in a mixed shop (Macs and PCs, like ours);
- Apple is introducing ZFS support on non-bootable storage under OS X  
10.6 but nobody really knows what ZFS version will be made available;
- License constraints on Linux: ZFS can't run in kernel mode, only
through FUSE, at the price of degraded performance (AFAIK);
- Maybe I'm missing something, but that's OK since I'm slightly drunk
right now. Please help (by suggesting missing points, not by suggesting
that I refrain from drinking on Saturday night ;-).

CONS:

- FreeBSD currently lacks enterprise-level support for modern
4Gb Fibre Channel HBAs, as you can read here:
http://lists.freebsd.org/pipermail/freebsd-scsi/2009-April/date.html#3876  <-- No reply, AFAIK;
http://lists.freebsd.org/pipermail/freebsd-scsi/2008-October/003686.html
http://lists.freebsd.org/pipermail/freebsd-scsi/2008-November/003705.html
http://lists.freebsd.org/pipermail/freebsd-stable/2008-November/046556.html
http://lists.freebsd.org/pipermail/freebsd-stable/2008-November/046602.html

Before throwing rotten tomatoes at me, please consider the
following facts:

- Our infrastructure is 90% based on FreeBSD. It has proven to be rock
solid over a 7-year timespan. Are we happy? Definitely yes;
- We are grateful to the FreeBSD community. This mail is
intended as constructive criticism, in order to exploit a "sweet
spot" in enterprise storage. This could dramatically boost the
adoption of FreeBSD in the enterprise, considering that,
due to the global crisis, most enterprises will start to look at
"low-cost, highly reliable, highly scalable storage systems";
- We are also grateful to Pawel Jakub Dawidek and all the other
FreeBSD ZFS contributors. You are doing a great job guys, thanks.

What can we do to foster this trend and further improve
ZFS support under FreeBSD, given that we cannot contribute
coding skills?

Unfortunately not a lot, but we can do the following:

- Donate some hardware (Fibre Channel HBAs) to the FreeBSD project
(paid out of my own pocket, not my employer's);
- Donate some money (paid from my employer's pocket, if I can  
demonstrate that this can help us to save big bucks on high-end  
storage systems);
- Document precisely all the tests that we are running on ZFS,
using the following storage: Apple Xraid, Nexsan Sas/SATAbeast, EMC
(waiting for the final configuration), as a contribution to the FreeBSD
community.

In short, please let us know what we can do
to make this dream come true.

FreeBSD: The power to serve, on steroids.

Alessandro Dellavedova

European Institute of Oncology
Department of Experimental Oncology
Via Adamello, 16 - 20139 Milan, Italy



