Linux NFS client and FreeBSD server strangeness

Fri Apr 6 01:38:24 UTC 2018

Mike Tancsa wrote:
>Thank you for all the feedback, pointers/insights.  Coming directly from
>'Mr. NFS', it's particularly appreciated :)
I could replace "Mr. NFS" with "Mr. stupid enough to do NFS without getting
paid to do it";-)
However, I should note that, although I am fairly familiar with the protocol and
the FreeBSD code, I don't have a lot of experience w.r.t. performance, at least
on newer hardware and fast networking.
[good stuff snipped]
>I think the root of the issue partially stems from the client having a
>LOT of RAM. So according to this default behaviour
There is a now rather ancient connectathon test suite for NFS, where one
of the tests is writing/reading a large 10Mbyte file. The 10Mbyte size was
selected because it was guaranteed to exceed the NFS client's buffer cache
capacity. (Maybe no longer true;-)

>----------------
>       The NFS client treats the sync mount option differently than some
>       other file systems (refer to mount(8) for a description of the
>       generic sync and async mount options).  If neither sync nor async
>       is specified (or if the async option is specified), the NFS client
>       delays sending application writes to the server until any of these
>       events occur:
>
>              Memory pressure forces reclamation of system memory resources.
>
>              An application flushes file data explicitly with sync(2),
>              msync(2), or fsync(3).
>
>              An application closes a file with close(2).
>
>              The file is locked/unlocked via fcntl(2).
>
>       In other words, under normal circumstances, data written by an
>       application may not immediately appear on the server that hosts
>       the file.
>-----------------------------
Just fyi, the FreeBSD client starts a write when the buffer cache block is
completely written with new data. (Called B_ASYNC in the code.)
If only part of a block has been written with new data, the write is delayed
until it is fully written with new data or one of the above cases applies.

You might want to slap together a test program that loops on write(2) for
a while, does an fsync(2), then more writing...
and see how that performs on both FreeBSD and Linux clients.
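
Something along these lines might do as a starting point. It is only a rough
sketch, not anything tested against an NFS mount: the function name
write_fsync_loop, the 64Kbyte block size and the burst/phase counts are all
made-up values for illustration, so adjust them to taste. Python's os.write()
and os.fsync() are thin wrappers around write(2) and fsync(2), so the RPC
pattern on the wire should be close to what an equivalent C program produces.

```python
import os
import sys
import time


def write_fsync_loop(path, block_size=64 * 1024, blocks_per_phase=256, phases=4):
    """Alternate bursts of write(2) with an fsync(2); return per-phase timings.

    Each entry in the returned list is (write_seconds, fsync_seconds).
    All sizes and counts are arbitrary illustration values.
    """
    buf = b"\xa5" * block_size
    timings = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for _phase in range(phases):
            t0 = time.monotonic()
            for _block in range(blocks_per_phase):
                # write(2) may accept fewer bytes than asked; loop until done.
                view = memoryview(buf)
                while view:
                    written = os.write(fd, view)
                    view = view[written:]
            t1 = time.monotonic()
            # Force everything written so far out to the server.
            os.fsync(fd)
            t2 = time.monotonic()
            timings.append((t1 - t0, t2 - t1))
    finally:
        os.close(fd)
    return timings


# Runs only when invoked as a script with exactly one argument:
# the path of a scratch file on the NFS mount under test.
if __name__ == "__main__" and len(sys.argv) == 2:
    for i, (w, s) in enumerate(write_fsync_loop(sys.argv[1])):
        print("phase %d: write %.3fs  fsync %.3fs" % (i, w, s))
```

Run it with the path of a scratch file on the NFS mount as the only argument,
once from each client. If the write times are tiny and the fsync times are
large, the client is holding the data back until the fsync(2); if they are
more evenly spread, the writes are going out as the blocks fill.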

I do find it interesting/weird that doing an "ls" concurrently with the writes
makes things work better. All the "ls" will do is inject a bunch of other RPC
messages into the TCP stream (small ones in the client->server direction).
The only thing I can think of is that the net interface is somehow "awakened"
by the small RPC messages (each one almost always in one TCP segment).
(Maybe something related to how the net interface device driver handles
 receive interrupts or ???)
If you find the "magic bullet" that makes the Linux case work well without
the concurrent "ls", please post and let us know what it is.

Good luck with it, rick
[lots of stuff snipped]