ZFS-backed NFS export with vSphere
Zoltan Arnold NAGY
zoltan.arnold.nagy at gmail.com
Thu Jun 27 22:16:49 UTC 2013
Right. As I said, increasing it to 1M increased my throughput from 17MB/s
to 76MB/s.
However, the SSD can handle far more random writes than that; any idea why
I never see the ZIL go above this value?
(vSphere always uses sync writes.)
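
For reference, this is roughly how I've been watching the log device while
the VM writes (just a sketch; pool and device names are from my box, adjust
as needed):

    # per-vdev throughput, including the ada0p4 log device, once per second
    zpool iostat -v tank 1

    # confirm the dataset really takes the sync path (vSphere issues
    # FILE_SYNC writes, so sync=standard means every write hits the ZIL)
    zfs get sync tank

    # raw I/O hitting the SSD partition that holds the ZIL
    gstat -f ada0p4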
Thanks,
Zoltan
On Thu, Jun 27, 2013 at 11:58 PM, Rick Macklem <rmacklem at uoguelph.ca> wrote:
> Zoltan Nagy wrote:
> > Hi list,
> >
> > I'd love to have a ZFS-backed NFS export as my VM datastore, but as much
> > as I'd like to tune it, the performance doesn't even get close to
> > Solaris 11's.
> >
> > I currently have the system set up like this:
> >
> >   pool: tank
> >  state: ONLINE
> >   scan: none requested
> > config:
> >
> >         NAME        STATE     READ WRITE CKSUM
> >         tank        ONLINE       0     0     0
> >           mirror-0  ONLINE       0     0     0
> >             da0     ONLINE       0     0     0
> >             da1     ONLINE       0     0     0
> >           mirror-1  ONLINE       0     0     0
> >             da2     ONLINE       0     0     0
> >             da3     ONLINE       0     0     0
> >         logs
> >           ada0p4    ONLINE       0     0     0
> >         spares
> >           da4       AVAIL
> >
> > ada0 is a Samsung 840 Pro SSD, which I'm using for the system + ZIL.
> > The daX devices are 1TB, 7200rpm Seagate disks.
> > (From this test's perspective it doesn't matter whether I use a separate
> > ZIL device or just a partition - I get roughly the same numbers.)
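> >
> > For completeness, the pool was created along these lines (reconstructed
> > from the status output above, so treat the exact command as a sketch):
> >
> > zpool create tank \
> >     mirror da0 da1 \
> >     mirror da2 da3 \
> >     log ada0p4 \
> >     spare da4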
> >
> > The first thing I noticed is that the FSINFO reply from FreeBSD is
> > advertising untunable values (I did not find them documented either in
> > the manpage or as a sysctl):
> >
> > rtmax, rtpref, wtmax, wtpref: 64k (fbsd), 1M (solaris)
> > dtpref: 64k (fbsd), 8k (solaris)
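> >
> > (If you want to see what a client actually ends up with after FSINFO, the
> > easiest check I know of is to mount the export from a plain Linux box and
> > look at the negotiated sizes - sketch below, hostname and path made up,
> > since the ESXi side is harder to inspect:)
> >
> > mount -t nfs -o vers=3 freebsd-box:/tank/vmstore /mnt
> > grep /mnt /proc/mounts   # shows the rsize=...,wsize=... that were agreed on
> > nfsstat -m               # per-mount NFS parameters on the Linux client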
> >
> > After manually patching the nfs code (changing NFS_MAXBSIZE to 1M instead
> > of MAXBSIZE) to advertise the same read/write values (I didn't touch
> > dtpref), my performance went up from 17MB/s to 76MB/s.
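> >
> > (For the record, this is roughly what the change looked like - the header
> > that carries the definition can move between releases, so treat it as the
> > spirit of the patch rather than an exact recipe:)
> >
> > # find where the server's maximum block size is defined
> > grep -rn NFS_MAXBSIZE /usr/src/sys/fs/
> > # change the definition from MAXBSIZE (64k) to 1MB, i.e. something like
> > #     #define NFS_MAXBSIZE    (1024 * 1024)
> > # then rebuild and install the kernel
> > cd /usr/src && make buildkernel && make installkernel && shutdown -r now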
> >
> > Is there a reason NFS_MAXBSIZE is not tunable, and/or why is it set so low?
> >
> For exporting other file system types (UFS, ...) the buffer cache is used,
> and MAXBSIZE is the largest block you can use for the buffer cache.
> Some increase of MAXBSIZE would be nice. (I've tried 128KB without observing
> difficulties, and from what I've been told 128KB is the ZFS block size.)
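>
> (The ZFS side of that is easy to check, by the way - recordsize is a
> per-dataset property that defaults to 128K:)
>
> zfs get recordsize tank
> zfs get -r recordsize tank    # all datasets in the pool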
>
> > Here's my iozone output (run on an ext4 partition in a Linux VM whose
> > disk is backed by the NFS datastore exported from the FreeBSD box):
> >
> > Record Size 4096 KB
> > File size set to 2097152 KB
> > Command line used: iozone -b results.xls -r 4m -s 2g -t 6 -i 0 -i 1 -i 2
> > Output is in Kbytes/sec
> > Time Resolution = 0.000001 seconds.
> > Processor cache size set to 1024 Kbytes.
> > Processor cache line size set to 32 bytes.
> > File stride size set to 17 * record size.
> > Throughput test with 6 processes
> > Each process writes a 2097152 Kbyte file in 4096 Kbyte records
> >
> > Children see throughput for 6 initial writers = 76820.31 KB/sec
> > Parent sees throughput for 6 initial writers = 74899.44 KB/sec
> > Min throughput per process = 12298.62 KB/sec
> > Max throughput per process = 12972.72 KB/sec
> > Avg throughput per process = 12803.38 KB/sec
> > Min xfer = 1990656.00 KB
> >
> > Children see throughput for 6 rewriters = 76030.99 KB/sec
> > Parent sees throughput for 6 rewriters = 75062.91 KB/sec
> > Min throughput per process = 12620.45 KB/sec
> > Max throughput per process = 12762.80 KB/sec
> > Avg throughput per process = 12671.83 KB/sec
> > Min xfer = 2076672.00 KB
> >
> > Children see throughput for 6 readers = 114221.39 KB/sec
> > Parent sees throughput for 6 readers = 113942.71 KB/sec
> > Min throughput per process = 18920.14 KB/sec
> > Max throughput per process = 19183.80 KB/sec
> > Avg throughput per process = 19036.90 KB/sec
> > Min xfer = 2068480.00 KB
> >
> > Children see throughput for 6 re-readers = 117018.50 KB/sec
> > Parent sees throughput for 6 re-readers = 116917.01 KB/sec
> > Min throughput per process = 19436.28 KB/sec
> > Max throughput per process = 19590.40 KB/sec
> > Avg throughput per process = 19503.08 KB/sec
> > Min xfer = 2080768.00 KB
> >
> > Children see throughput for 6 random readers = 110072.68 KB/sec
> > Parent sees throughput for 6 random readers = 109698.99 KB/sec
> > Min throughput per process = 18260.33 KB/sec
> > Max throughput per process = 18442.55 KB/sec
> > Avg throughput per process = 18345.45 KB/sec
> > Min xfer = 2076672.00 KB
> >
> > Children see throughput for 6 random writers = 76389.71 KB/sec
> > Parent sees throughput for 6 random writers = 74816.45 KB/sec
> > Min throughput per process = 12592.09 KB/sec
> > Max throughput per process = 12843.75 KB/sec
> > Avg throughput per process = 12731.62 KB/sec
> > Min xfer = 2056192.00 KB
> >
> > The other interesting thing is that the system doesn't cache the data
> > file in RAM (the box has 32G), so even for re-reads I get miserable
> > numbers. With Solaris, the re-reads happen at nearly wire speed.
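> >
> > (A quick way to see whether the re-reads are at least hitting the ARC on
> > the FreeBSD side - sysctl names as on 9.x, sample before and after a
> > re-read pass:)
> >
> > sysctl kstat.zfs.misc.arcstats.size vfs.zfs.arc_max
> > sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses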
> >
> > Any ideas what else I could tune? While 76MB/s is much better than the
> > original 17MB/s I was seeing, it's still far from Solaris's ~220MB/s...
> >
> > Thanks a lot,
> > Zoltan