Re: optimising nfs and nfsd

From: void <void_at_f-m.fm>
Date: Tue, 31 Oct 2023 17:18:19 UTC
On Mon, Oct 30, 2023 at 05:31:50PM -0700, Rick Macklem wrote:

>Well, here's a couple more things to look at:
>- Number of nfsd threads. I prefer to set the min/max to the same
>  value (which is what the "-n" option on nfsd does).  Then, after
>  the server has been running for a while in production load, I do:
>  # ps axHl | fgrep nfsd
>  and I look to see how many of the threads have a TIME of
>  0:00.00. (These are extra threads that are not needed.)
>  If there is a moderate number of these, I consider it aok.
>  If there are none of these, more could improve NFS performance.
>  If there are lots of these, the number can be decreased, but they
>  don't result in much overhead, so I err on the large # side.
>  - If you have min set to less than max, the above trick doesn't
>    work, but I'd say that if the command shows the max# of threads,
>    it could be increased.
>This number can be configured via options on the nfsd command line.
>If you aren't running nfsd in a jail, you can also fiddle with them via
>the sysctls:
>vfs.nfsd.minthreads
>vfs.nfsd.maxthreads

root@storage:/root# ps axHl | ug -e server | ug -e "0:00.00" | ug -e rpcsvc | wc -l
       26
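
For anyone without ug installed, roughly the same count can be sketched with awk. This is only a sketch: count_idle_nfsd is a made-up helper name, and it assumes the thread's TIME shows up as a whitespace-separated field reading exactly 0:00.00 on lines that mention nfsd. It filters a little more loosely than the pipeline above (no "server" or "rpcsvc" match).

```shell
# Hypothetical helper: reads "ps axHl" output on stdin and counts the
# nfsd lines that have a field equal to 0:00.00 (threads with no CPU time).
count_idle_nfsd() {
    awk '
        /nfsd/ {
            for (i = 1; i <= NF; i++)
                if ($i == "0:00.00") { idle++; break }
        }
        END { print idle + 0 }   # "+ 0" prints 0 rather than an empty line
    '
}

# Intended use on the server:
#   ps axHl | count_idle_nfsd
```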

How many is a moderate number? Right now there are three clients connected
using version 3, and one of them is doing a lot of (small) I/O.

root@storage:/root# ug nfs /etc/rc.conf*
/etc/rc.conf
     19: nfs_server_enable="YES"

I've not specified -n anywhere, so everything is at the defaults, and there
is nothing in /etc/exports. All NFS exports are set up via the sharenfs ZFS
property.

>The caveat is that, if the NFS server is also doing other things,
>increasing the number of nfsd threads can result in nfsd "hogging"
>the system.

Serving NFS is the machine's only role, so I'm concerned only with, I guess,
high availability/responsiveness, and with tuning it, if it needs it, so
that it functions best with the hardware.

The CPU on the NFS server is a Xeon E5-2407 @2.2GHz. HT is disabled,
so 4 CPUs are available, and there's 64GB RAM. I can't do much about the
CPU apart from enabling HT, which might not be an issue as there's no
routed IPv4. I can increase the RAM if required, though.

This top(1) output is fairly typical:

last pid: 55448;  load averages:  0.04,  0.12,  0.14 up 12+10:20:03  14:55:45
54 processes:  1 running, 53 sleeping
CPU:  0.1% user,  0.0% nice,  4.0% system,  0.1% interrupt, 95.8% idle
Mem: 96K Active, 879M Inact, 182M Laundry, 59G Wired, 1965M Free
ARC: 54G Total, 21G MFU, 31G MRU, 664M Header, 1362M Other
      50G Compressed, 78G Uncompressed, 1.56:1 Ratio
Swap: 8192M Total, 8192M Free

   PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
 82935 root         32 -60    0    12M  2276K rpcsvc   0 113:03   4.22% nfsd

I could probably set vfs.zfs.arc.max. Right now it's at the default 0 = no maximum.
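
A rough sketch of how a cap could be worked out, should one turn out to be needed. The helper name arc_max_bytes and the 8 GiB of headroom are assumptions for illustration only, not recommendations; the value is in bytes.

```shell
# Hypothetical helper: bytes for an ARC cap that leaves headroom_gib of
# RAM for everything other than the ARC.
arc_max_bytes() {
    total_gib=$1
    headroom_gib=$2
    echo $(( (total_gib - headroom_gib) * 1024 * 1024 * 1024 ))
}

# 64 GiB of RAM with an assumed 8 GiB held back:
arc_max_bytes 64 8

# The result would then be applied with something like (not tried here):
#   sysctl vfs.zfs.arc.max="$(arc_max_bytes 64 8)"
```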

>NFSv4 server hash table sizes:
>Run "nfsstat -E -s" on the server after it has been up under production
>load for a while.

It seems v4 isn't running, or isn't able to run. The client gets an error
if I try to specify vers=4 in the sharenfs property on the server:

zfs set sharenfs="vers=4 maproot=root -alldirs -network 192.168.1.0 -mask 255.255.255.0" zroot/vms-volumes

then "service mountd restart && service nfsd restart"

then, on the client
# mount /vms-volumes
[tcp] 192.168.1.102:/zroot/vms-volumes: Permission denied

After removing vers=4 and restarting rpcbind & nfsd on the server, the
mount works fine, but as version 3.

>Look at the section near the end called "Server:".
>The number under "Clients" should be roughly the number of client
>systems that have NFSv4 mounts against the server.

Yep, that shows 0, as expected with no v4. Here are the stats with just -s:

Server Info:
       Getattr Setattr Lookup Readlink Read    Write    Create Remove
       85998   942     176996 0        8661219 33713618 0      898
       Rename  Link  Symlink  Mkdir    Rmdir Readdir    RdirPlus Access
       9       0     0        15       43    62         55       23111
       Mknod       Fsstat       Fsinfo     PathConf       Commit
           0          191           39           42       111088

Server Write 
      WriteOps     WriteRPC      Opsaved
      33713618     33713618            0
Server Cache 
        Inprog     Non-Idem       Misses
             0            0     42775262
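
To make counters like these easier to compare between runs, the header and value rows can be paired up. This is a minimal sketch: nfsstat_pairs is a made-up helper name, and it assumes the layout pasted above, where each row of names is followed by a row of numbers in the same column order.

```shell
# Hypothetical helper: reads "nfsstat -s" output on stdin and prints one
# "Name Value" pair per line.
nfsstat_pairs() {
    awk '
        # A line whose first field is not a plain number is a header row
        # (titles such as "Server Info:" are simply overwritten by the next one).
        $1 !~ /^[0-9]+$/ { n = split($0, names, " "); next }
        # A numeric row is paired column by column with the latest header row.
        { for (i = 1; i <= NF && i <= n; i++) print names[i], $i }
    '
}

# Intended use on the server:
#   nfsstat -s | nfsstat_pairs
```

With the numbers above this would give lines such as "Read 8661219" and "Write 33713618", which shows the load here is mostly writes.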

Thanks again for your help and advice, I'm taking notes :D

I'll reply in a followup email to the bit following
>The two tunables:
>vfs.nfsd.clienthashsize
>vfs.nfsd.sessionhashsize

--