Linux NFS client and FreeBSD server strangeness

Fri Apr 6 00:44:37 UTC 2018

Bruce Evans wrote:
>On Wed, 4 Apr 2018, Kaya Saman wrote:
>> If I recall correctly the "sync" option is default. Though it might be
>> different depending on the Linux distro in use?
>>
>> I use this: vers=3,defaults,auto,tcp,rsize=8192,wsize=8192
>>
>> though I could get rid of the tcp as that's also a default. The rsize
>> and wsize options are for running Jumbo Frames, i.e. larger MTU than
>> 1500; in my case 9000 for 1Gbps links.
>
>These rsize and wsize options are pessimizations.  They override the
>default sizes which are usually much larger for tcp.
Yes, for TCP, the FreeBSD client uses the largest size supported by the
server, up to 128K (because MAXPHYS is set to that and, as such, that
is the largest size safely supported by the buffer cache).

I chose to make it this large by default for a couple of reasons:
1 - Solaris used 256K by default (and a maximum of 1Mbyte) back when it
      was Sun and their engineers were pretty good at this stuff.
     (I believe they argued that fewer RPCs implied lower server load for a
      given # of bytes. Usually the NFS engineering types have been concerned
      with server load and, therefore, the server's capacity and not the performance
      of a single client doing a single file write.)
2 - I don't do ZFS, but some thought that 128K would be a better I/O read/write
     size for ZFS.
Personally, since all I have for testing is 100Mbits/sec networking, I always
get "wire speed" and don't see any difference for different rsize/wsize over
TCP, so long as it is at least 16K.
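
The "fewer RPCs for a given # of bytes" argument in point 1 can be put in
numbers. The following is only a rough sketch: the function name
rpcs_needed and the 1 GiB transfer size are made up for illustration, and
it assumes every READ/WRITE RPC carries a full rsize/wsize worth of data.

```python
KIB = 1024


def rpcs_needed(total_bytes: int, io_size: int) -> int:
    """READ or WRITE RPCs needed to move total_bytes at io_size bytes per RPC.

    Assumes each RPC is filled completely except possibly the last one.
    """
    if io_size <= 0:
        raise ValueError("io_size must be positive")
    # Integer ceiling division: a partial final block still costs one RPC.
    return (total_bytes + io_size - 1) // io_size


if __name__ == "__main__":
    transfer = 1024 * 1024 * KIB  # hypothetical 1 GiB file
    for size_kib in (8, 16, 128, 256):
        count = rpcs_needed(transfer, size_kib * KIB)
        print(f"rsize/wsize {size_kib:>3}K: {count} RPCs")
```

Going from 8K to 128K cuts the RPC count for the same data by a factor of
16, which is the server load argument, independent of what a single
client sees for throughput.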

One case where large rsize/wsize plus a larger readahead setting should get
better performance is when the network connection is a "long, fat pipe"
such as a high bandwidth WAN connection. (Basically, you need to push a lot
of bits down the TCP pipe before you wait for an RPC reply, to try and keep
the long, fat pipe filled.)
In theory NFSv4 was meant for the Internet.
Does anyone use it on WAN links? Probably yes, but not typically.
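
To illustrate the "long, fat pipe" point: the amount of data that has to
be in flight to keep a link busy is roughly bandwidth times round trip
time (the bandwidth-delay product). The sketch below estimates how many
RPCs of a given rsize that corresponds to. The link speeds and RTTs are
hypothetical examples, not measurements, and the function names are made
up for illustration. It also ignores RPC and TCP header overhead.

```python
import math


def bdp_bytes(bandwidth_bits_per_sec: float, rtt_sec: float) -> float:
    """Bandwidth-delay product in bytes: data in flight needed to fill the pipe."""
    return bandwidth_bits_per_sec / 8 * rtt_sec


def rpcs_in_flight(bandwidth_bits_per_sec: float, rtt_sec: float, io_size: int) -> int:
    """Approximate number of outstanding RPCs of io_size bytes to cover the BDP."""
    return max(1, math.ceil(bdp_bytes(bandwidth_bits_per_sec, rtt_sec) / io_size))


if __name__ == "__main__":
    # (label, bits per second, round trip time in seconds) -- assumed values
    links = [
        ("100Mbps LAN, 0.2ms RTT", 100e6, 0.0002),
        ("1Gbps WAN, 50ms RTT", 1e9, 0.050),
    ]
    for label, bw, rtt in links:
        for io_size in (8 * 1024, 128 * 1024):
            n = rpcs_in_flight(bw, rtt, io_size)
            print(f"{label}, {io_size // 1024}K I/O size: ~{n} RPCs in flight")
```

On the LAN case a single outstanding RPC already covers the pipe, which
fits with seeing wire speed regardless of rsize/wsize. On the WAN case
the count is large even at 128K, so both a large I/O size and more
readahead are needed.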

I have no idea what Linux uses, except that packet traces often show page
size (4K) I/O sizes, but not always.

For UDP, I think the FreeBSD default is 16K for NFSv3 (UDP is not allowed for
NFSv4 since congestion control at the transport level is required by the
RFCs). Congestion control and reliability are why I always use TCP and,
again, for 100Mbit/sec networking, I see wire speed. Both Linux and Solaris
use TCP by default for NFSv3 mounts, which is mainly why it is the default
for FreeBSD too.

>The defaults are not documented in the man page, and the current
>settings are almost equally impossible to see (e.g., mount -v doesn't
>show them).  The defaults are not quite impossible to see in the source
>code of course, but the source code for them is especially convoluted.
For FreeBSD, "nfsstat -m" on the client shows what is actually being used.
(I think Linux has a similar option, but I can't remember for sure?)
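
Once you have an options string in hand, checking whether rsize/wsize
were overridden is easy to script. This is a minimal sketch that assumes
the options are a comma-separated list of flag and key=value items, like
the mount options quoted earlier in this thread. The exact layout of
"nfsstat -m" output may differ, so treat the parsing as an assumption.
The helper name parse_mount_options is made up for illustration.

```python
def parse_mount_options(opts: str) -> dict:
    """Split a comma-separated options string into a dict.

    key=value items map key -> value (as a string);
    bare flags such as "tcp" map to True.
    """
    parsed = {}
    for item in opts.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


if __name__ == "__main__":
    # Option string taken from the quoted message above.
    opts = parse_mount_options("vers=3,defaults,auto,tcp,rsize=8192,wsize=8192")
    for name in ("rsize", "wsize"):
        if name in opts:
            print(f"{name} explicitly set to {int(opts[name])} bytes")
        else:
            print(f"{name} not set, the default applies")
```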

[lots of good stuff snipped]

rick