NFSv3, ZFS, 10GE performance

Rick Macklem rmacklem at uoguelph.ca
Mon Mar 26 21:48:04 UTC 2012


Sven Brandenburg wrote:
> On 03/26/2012 12:37 PM, Ivan Voras wrote:
> > You could try modifying the rsize and wsize NFS options (read
> > mount_nfs(8)), they help with UFS.
> 
> I tried this a few days ago and fiddling rsize alters performance from
> "ok" to "terrible".
> However, you made me revisit this and mount_nfs(8) seems to have a gem
> in its options: readahead.
> This did the trick for me and my (long and sequential) reads.
> While the manpage says it's limited to 0-4, the best results were
> achieved with readahead=8 : 1.1GB/s - which is what I had hoped for.
> 
The new NFS client (which is the default in 9) uses the largest size
the server supports, capped at MAX_BSIZE, as the default rsize and wsize.
(And the server will allow an rsize/wsize of up to MAX_BSIZE.)

MAX_BSIZE is 64Kb. I'd like to try making that bigger, but haven't gotten
around to it yet. (If you wanted to try bumping MAX_BSIZE to 128Kb on both
client and server and seeing what happens, that might be interesting, since
my understanding is that ZFS uses a 128Kb block size.)
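
As a rough illustration of why 128Kb is the interesting value, the sketch
below counts how many read RPCs it takes to cover one 128Kb ZFS block at
different rsize values. The function name and the candidate sizes are
illustrative assumptions; the 128Kb figure is the block size mentioned
above, not something measured here.

```python
KIB = 1024
ZFS_BLOCK = 128 * KIB  # block size mentioned in the text (assumed, not measured)

def rpcs_per_block(rsize, block=ZFS_BLOCK):
    """Number of read RPCs of size `rsize` needed to cover one block."""
    return -(-block // rsize)  # integer ceiling division

# Candidate rsize values in Kb; 64 is the current MAX_BSIZE.
rpc_counts = {rsize // KIB: rpcs_per_block(rsize)
              for rsize in (32 * KIB, 64 * KIB, 128 * KIB)}
print(rpc_counts)
```

With rsize at the current 64Kb cap, each 128Kb block takes two RPCs; at
128Kb it would take one.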

I'd guess that you needed a readahead of 8 to fill the TCP pipe, but I have
no idea what packet transit time you have between client<->server. (In
other words, 64Kb * 8 fills the data pipe and anything less doesn't.)
My experience for LANs is that a larger block size with a smaller readahead
works about as well: for example, 128Kb * 4 or 512Kb * 1, if MAX_BSIZE could
be bumped to 512Kb. (Solaris10 servers allow 1Mbyte rsize/wsize, if I recall
correctly.)
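
To make the arithmetic concrete, here is a minimal sketch that treats the
data in flight as simply rsize * readahead. That is a simplified model
assumed for illustration, not a description of how the client actually
schedules reads, and the function names are made up for this example. The
1.1GB/s figure is the one Sven reported.

```python
KIB = 1024

def in_flight_bytes(rsize, readahead):
    """Read data outstanding under the simplified rsize * readahead model."""
    return rsize * readahead

def round_trip_covered(window_bytes, bytes_per_second):
    """Round-trip time (seconds) that a window of this size can keep busy."""
    return window_bytes / bytes_per_second

# The three combinations mentioned in the text.
configs = [(64 * KIB, 8), (128 * KIB, 4), (512 * KIB, 1)]
windows = [in_flight_bytes(rsize, ra) for rsize, ra in configs]

# Reported throughput: 1.1GB/s, with GB meaning 10^9 bytes.
rtt = round_trip_covered(windows[0], 1.1e9)

for (rsize, ra), window in zip(configs, windows):
    print(f"{rsize // KIB}Kb * {ra} = {window // KIB}Kb in flight")
print(f"covers a round trip of about {rtt * 1e3:.2f} ms at 1.1GB/s")
```

All three combinations put the same 512Kb in flight, which is why they
would be expected to perform about the same if the model holds.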

So, beyond what you've done, all I can suggest is to try bumping MAX_BSIZE
up (but I have no idea if such a system even boots;-).

Have fun with it, rick

> On a tangent: gnu-dd 1GB/s is 10^9 Bytes/s, not 2^30. Yes, I fell for it
> at first :)
> The good news is that there was no fiddling on the NFS server side.
> (Apart from MTU increases, PCI settings and more buffers to get TCP
> performance to full tilt in the first place)
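
For reference, the unit difference noted above works out as follows. This
is just a conversion sketch; the variable names are illustrative, and the
only input is the reported 1.1GB/s.

```python
reported = 1.1e9           # bytes/s, reading dd's "GB" as 10^9 bytes

gib_per_s = reported / 2**30      # same rate in units of 2^30 bytes
gbit_per_s = reported * 8 / 1e9   # same rate in decimal gigabits per second

print(f"{gib_per_s:.3f} GiB/s, {gbit_per_s:.1f} Gbit/s")
```

So the reported rate is a little over 1.02 in 2^30-byte units, and about
8.8 Gbit/s of payload on the 10GE link.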
> 
> Hopefully, readahead doesn't kill performance for smaller files.. :-)
> 
> regards,
> Sven
