ZFS-backed NFS export with vSphere
Zoltan Arnold NAGY
zoltan.arnold.nagy at gmail.com
Thu Jun 27 13:13:54 UTC 2013
Hi list,
I'd love to have a ZFS-backed NFS export as my VM datastore, but no matter
how much I tune it, the performance doesn't even get close to Solaris 11's.
I currently have the system set up like this:
  pool: tank
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            da0     ONLINE       0     0     0
            da1     ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da3     ONLINE       0     0     0
        logs
          ada0p4    ONLINE       0     0     0
        spares
          da4       AVAIL
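(For reference, a pool with this layout would be created along these lines -
reconstructed from the status output above, not the exact command I ran:

    zpool create tank mirror da0 da1 mirror da2 da3 log ada0p4 spare da4
)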
ada0 is a Samsung 840 Pro SSD, which I'm using for the system + ZIL.
The daX devices are 1TB, 7200rpm Seagate disks.
(For this test it doesn't matter whether I use a separate ZIL device or just
a partition - I get roughly the same numbers.)
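(In case it's useful: per-vdev activity during the test can be watched with

    zpool iostat -v tank 1

which should show the log device, ada0p4, absorbing the sync writes that
ESXi issues over NFS.)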
The first thing I noticed is that the FSINFO reply from FreeBSD advertises
values that don't seem to be tunable (I did not find them documented in the
manpages or as a sysctl):
    rtmax, rtpref, wtmax, wtpref:  64k (FreeBSD) vs 1M (Solaris)
    dtpref:                        64k (FreeBSD) vs 8k (Solaris)
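(If anyone wants to compare their own server: the advertised values can be
seen by capturing the traffic during a mount and looking at the FSINFO reply
in Wireshark, e.g. something like

    tcpdump -i em0 -s 0 -w nfs-mount.pcap port 2049

where em0 is whatever interface the clients come in on.)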
After manually patching the NFS code to advertise the same read/write values
(changing NFS_MAXBSIZE to 1M instead of MAXBSIZE; I didn't touch dtpref),
my throughput went up from 17MB/s to 76MB/s.
Is there a reason NFS_MAXBSIZE is not tunable, and/or why the default is so
slow?
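For reference, the change is essentially a one-liner along these lines (I'm
not certain the header path is the same across versions, so treat the
location as approximate):

    /* sys/fs/nfs/nfsport.h, or wherever NFS_MAXBSIZE is defined */
    -#define NFS_MAXBSIZE    MAXBSIZE
    +#define NFS_MAXBSIZE    (1024 * 1024)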
Here's my iozone output (run on an ext4 partition created in a Linux VM
whose disk lives on the NFS datastore exported from the FreeBSD box):
Record Size 4096 KB
File size set to 2097152 KB
Command line used: iozone -b results.xls -r 4m -s 2g -t 6 -i 0 -i 1 -i 2
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 6 processes
Each process writes a 2097152 Kbyte file in 4096 Kbyte records
Children see throughput for 6 initial writers = 76820.31 KB/sec
Parent sees throughput for 6 initial writers = 74899.44 KB/sec
Min throughput per process = 12298.62 KB/sec
Max throughput per process = 12972.72 KB/sec
Avg throughput per process = 12803.38 KB/sec
Min xfer = 1990656.00 KB
Children see throughput for 6 rewriters = 76030.99 KB/sec
Parent sees throughput for 6 rewriters = 75062.91 KB/sec
Min throughput per process = 12620.45 KB/sec
Max throughput per process = 12762.80 KB/sec
Avg throughput per process = 12671.83 KB/sec
Min xfer = 2076672.00 KB
Children see throughput for 6 readers = 114221.39 KB/sec
Parent sees throughput for 6 readers = 113942.71 KB/sec
Min throughput per process = 18920.14 KB/sec
Max throughput per process = 19183.80 KB/sec
Avg throughput per process = 19036.90 KB/sec
Min xfer = 2068480.00 KB
Children see throughput for 6 re-readers = 117018.50 KB/sec
Parent sees throughput for 6 re-readers = 116917.01 KB/sec
Min throughput per process = 19436.28 KB/sec
Max throughput per process = 19590.40 KB/sec
Avg throughput per process = 19503.08 KB/sec
Min xfer = 2080768.00 KB
Children see throughput for 6 random readers = 110072.68 KB/sec
Parent sees throughput for 6 random readers = 109698.99 KB/sec
Min throughput per process = 18260.33 KB/sec
Max throughput per process = 18442.55 KB/sec
Avg throughput per process = 18345.45 KB/sec
Min xfer = 2076672.00 KB
Children see throughput for 6 random writers = 76389.71 KB/sec
Parent sees throughput for 6 random writers = 74816.45 KB/sec
Min throughput per process = 12592.09 KB/sec
Max throughput per process = 12843.75 KB/sec
Avg throughput per process = 12731.62 KB/sec
Min xfer = 2056192.00 KB
The other interesting thing is that the system doesn't cache the data file
in RAM (the box has 32G), so even for re-reads I get miserable numbers. With
Solaris, the re-reads happen at nearly wire speed.
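(For anyone replying: as far as I know the caching behaviour should be
visible via the dataset's primarycache property and the ARC sysctls, e.g.

    zfs get primarycache tank
    sysctl vfs.zfs.arc_max
    sysctl kstat.zfs.misc.arcstats.size
)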
Any ideas what else I could tune? While 76MB/s is much better than the
original 17MB/s I was seeing, it's still far from Solaris's ~220MB/s...
Thanks a lot,
Zoltan