Re: optimising nfs and nfsd

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Sun, 29 Oct 2023 20:28:03 UTC
On Sun, Oct 29, 2023 at 4:41 AM void <void@f-m.fm> wrote:
>
> Hello list,
>
> The nfs instructions in the handbook are rather terse.
>
> I know there's been lots of new development with nfs.
> The zfs property of sharenfs, version4, KTLS etc.
> The "readahead" property for clients.
>
> Would anyone here please point me to up-to-date
> resources? My context is nfs server exporting
> via sharenfs with freebsd14, -current and debian-based
> linux clients on a gigabit LAN.
I have a few primitive docs, but they do not cover what you
are interested in (all found at https://people.freebsd.org/~rmacklem):
nfs-krb5-setup.txt
nfs-over-tls-setup.txt
nfsd-vnet-prison-setup.txt
pnfs-planb-setup.txt

However, here are a few comments that might be useful...
- If you run "nfsstat -m" on the client(s), it will show what
  they are using. If the mounts are NFSv4.0, consider switching
  to NFSv4.1/4.2. (I consider NFSv4.0 a deprecated protocol.
  NFSv4.1/4.2 does assorted things better, including something
  called "sessions", which replaces use of the DRC.)
  If the mounts are NFSv3 and work well for you, that's fine. NFSv3
  will actually perform better than NFSv4, but lacks things like
  good byte-range locking support.
- If you do something like:
  dd if=/nfsmount/bigfile of=/dev/null bs=1M
  and get wire speed (100+ Mbytes/sec for 1Gbps), then
  there probably is not much more you can do performance-wise.
  Mount options like "readahead/rsize/wsize/nconnect" can improve
  performance if the mount is not already running at close to wire speed.
  (They are all in the "try it and see if it helps/your mileage may vary"
  category.)
  The Linux client folk do try to make the defaults work well.
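A sketch of that try-it-and-see loop on a Linux client follows; the option values are illustrative assumptions, not recommendations, and remounting cleanly (umount, then mount) is the safest way to make them take effect:

```shell
# Baseline: sequential read of a large already-written file. On 1Gbps,
# wire speed is roughly 110-117 Mbytes/sec after protocol overhead.
dd if=/mnt/nfs/bigfile of=/dev/null bs=1M

# If the result is well below wire speed, experiment with the Linux
# client options mentioned above, one at a time, and re-run the dd:
umount /mnt/nfs
mount -t nfs -o vers=4.2,rsize=1048576,wsize=1048576,nconnect=4 \
    server:/export /mnt/nfs
```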

Interrupt moderation...
- Most NICs do not generate an interrupt for every packet sent/received,
  to avoid an interrupt flood. Unfortunately, this can delay RPC message
  handling and have a negative impact on NFS performance, since NFS
  primarily depends on RPC RTT and not bandwidth. Most NIC drivers
  do have tunable(s) for this. Again, your mileage may vary...
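As a hedged example of where those knobs usually live (the interface name and values are placeholders, and the FreeBSD sysctl names vary by driver):

```shell
# Linux: inspect and adjust interrupt coalescing via ethtool.
# Lower usecs = lower RPC latency, but more interrupts/sec.
ethtool -c eth0                      # show current coalescing settings
ethtool -C eth0 rx-usecs 8 tx-usecs 8

# FreeBSD: the knobs are driver-specific; search the dev tree for
# moderation/interrupt-rate sysctls exposed by your NIC driver:
sysctl dev | grep -i -e itr -e intr_rate
```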

If you are using NFSv3 or NFSv4.0 mounts, performance can be
improved by tuning or disabling the duplicate request cache (DRC).
The DRC is an oddball, in that it improves correctness (avoiding
non-idempotent RPCs being performed multiple times), but slows
performance. For a good LAN, TCP may not need this. (For TCP
mounts, RPCs are only retried after the RPC layer gives up on a
TCP connection and creates a new one. With a good LAN, this
should be a rare occurrence.)
# sysctl vfs.nfsd.cachetcp=0
is the extreme tuning case that turns the DRC off for TCP.
(Again, this is irrelevant for NFSv4.1/4.2 mounts, since they do
 not use the DRC.)
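On the FreeBSD server, the inspect-then-disable sequence might look like this (a sketch; only worth doing for NFSv3/NFSv4.0 mounts):

```shell
# Inspect current DRC behaviour on the server:
sysctl vfs.nfsd.cachetcp     # 1 = DRC is used for TCP (the default)
nfsstat -e -s                # extended server stats, incl. server cache

# The extreme case: turn the DRC off for TCP mounts:
sysctl vfs.nfsd.cachetcp=0
# Make the setting persistent across reboots:
echo 'vfs.nfsd.cachetcp=0' >> /etc/sysctl.conf
```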

NFS-over-TLS (called RPC-over-TLS by the Linux folk) is
discussed in one of the primitive docs mentioned above.
It uses KTLS to run all the NFS traffic within a TLS1.3
session (not the same as an NFSv4.1/4.2 session).
This has obvious security advantages, but can result in
about 1/3rd of a CPU core being used per NFS connection
for encryption/decryption when the NFS mount is busy.
I am not sure quite where the Linux client patches are
at this point. I know they have been testing them, but I
suspect you need a very recent Linux kernel to get the
support, so I doubt they are in most distros yet.
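For the FreeBSD side, the rough shape of the setup is below; this is a sketch from the nfs-over-tls-setup.txt doc's approach, the rc.conf knob names are my recollection, and certificate generation (which the doc covers) is skipped entirely:

```shell
# FreeBSD server side: load kernel TLS crypto support and start the
# userland daemon that performs the TLS handshakes for nfsd:
kldload ktls_ocf
sysrc tlsservd_enable="YES"       # runs rpc.tlsservd
service tlsservd start

# A FreeBSD client would run rpc.tlsclntd and mount with "tls":
sysrc tlsclntd_enable="YES"
service tlsclntd start
mount -t nfs -o nfsv4,minorversion=2,tls server:/export /mnt/nfs
```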

In summary, if you are getting near wire speed and you
are comfortable with your security situation, then there
isn't much else to do.

rick



>
> Some workloads are very interactive. Some are not,
> such as backups. It works, but it maybe can work
> better.
>
> Thanks!
> --
>