NFS home directory performance tuning for Linux client

Kaya Saman kayasaman at gmail.com
Mon Aug 21 18:47:09 UTC 2017


So, I've now tried something a little different, which worked out
well.


I moved away from NFS and set up a ZVOL on the spare SSD mentioned
below, then set the server up as an iSCSI target.


With the Linux machine running as the iSCSI initiator, I created a
partition and a filesystem (I went with JFS, which is my preferred
choice on Linux), and it's running quite smoothly at present.
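
A rough sketch of the steps a move like this involves, written as a dry
run that only prints the commands. The volume name, size, IQN and
/dev/sdX1 are placeholders made up for illustration, not the values
actually used, and the target itself still has to be described in
/etc/ctl.conf on the server:

```shell
#!/bin/sh
# Dry run: every step is printed, not executed.
# ASSUMED placeholders: volume name, size, IQN, <IP>, /dev/sdX1.
ZVOL="ssd/linuxhome"
IQN="iqn.2017-08.com.example:linuxhome"

step() { echo "$*"; }

# FreeBSD server: create the ZVOL and start the iSCSI target daemon.
server_steps() {
    step zfs create -V 100G "$ZVOL"   # appears as /dev/zvol/$ZVOL
    step sysrc ctld_enable=YES        # target defined in /etc/ctl.conf
    step service ctld start
}

# Linux client: discover the target, log in, create the filesystem.
client_steps() {
    step iscsiadm -m discovery -t sendtargets -p "<IP>"
    step iscsiadm -m node -T "$IQN" -p "<IP>" --login
    step mkfs.jfs /dev/sdX1           # after partitioning the new disk
}

server_steps
client_steps
```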


The bottleneck was definitely caused by NFS, and I think it was the
write behaviour for small files. With the limited options available,
though, I have no idea what could be causing the issues or how to get
round them.



-------- Forwarded Message --------
Subject: 	NFS home directory performance tuning for Linux client
Date: 	Mon, 21 Aug 2017 15:00:14 +0100
From: 	Kaya Saman <kayasaman at gmail.com>
To: 	freebsd-questions <freebsd-questions at freebsd.org>



Hi,


I'm testing an Arch Linux client with my FreeBSD server, which has
recently been updated to 11.1. The server runs a zpool spread over 15
disks with an SSD L2ARC. As another test point, I am also using a
separate SSD (a zpool on a single disk) in the server for comparison.


For non-home NFS mounts I found version 4 to perform well. However,
after increasing the MTU to 9000 across the network (NICs, switches,
routing, etc.), I tend to see a lot of server timeouts, even with
rsize and wsize increased to 8192.
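
One way to check that a 9000-byte MTU really holds end to end is an
unfragmented ping with the largest payload that fits. The sketch below
only computes that payload and prints the probe commands; <IP> and
<client-IP> are placeholders, and the header sizes assume plain IPv4
with no IP options:

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
icmp_payload() {
    echo $(( $1 - 20 - 8 ))
}

size=$(icmp_payload 9000)

# Print, rather than run, the probe for each side.
# -M do (Linux) and -D (FreeBSD) both set the Don't Fragment bit.
echo "Linux client:   ping -M do -c 3 -s $size <IP>"
echo "FreeBSD server: ping -D -c 3 -s $size <client-IP>"
```

If a probe of that size fails while a smaller one succeeds, some hop on
the path is not passing jumbo frames.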


Hard-setting the Linux clients to vers=3 in fstab gives stability, as
in no timeouts, with no apparent decrease in performance either.


What is odd, however, is that FreeBSD to FreeBSD just works without
any issues at all, so I'm wondering whether the NFS implementation on
Linux is slightly different and is causing these issues?


To the main question, however: when the mount is used as an NFS home
directory, setting vers=3 on the client makes the system unusable. It
takes roughly 5-10 minutes after login for anything to appear on
screen, then another 5-10 minutes for a response after clicking
anywhere.

Setting vers=4 improves things significantly, but when using an
application like Chromium the system will still hang for 5-10 minutes
while browsing, then come alive again.


I have set the server up as follows in rc.conf:


nfs_server_flags="-t -n 128 -h <IP>"
nfs_server_enable="YES"
nfsv4_server_enable="YES"
nfsuserd_enable="YES"
nfsuserd_flags="-domain domian.com"
rpc_statd_enable="YES"
rpc_lockd_enable="YES"
rpcbind_enable="YES"
rpcbind_flags="-h <IP>"
mountd_enable="YES"
mountd_flags="-r -n -l -h <IP>"


I've even tried increasing the sysctl variable vfs.nfs.iodmax from 20
to 60.


On the client side the fstab entry contains the following options:


vers=4,defaults,auto,tcp,retrans=10,timeo=30,rsize=8192,wsize=8192,noatime


and gets mounted to /mnt/home. I realize the 'tcp' flag doesn't need
to be there, as v4 uses TCP by default; however, it is there for when
I test with v3.
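
For reference, a small sketch that pulls the timing and transfer-size
values out of that option string. It relies on the Linux nfs(5)
convention that timeo is given in tenths of a second; the helper names
are made up for illustration:

```shell
#!/bin/sh
# Option string copied from the fstab entry above.
opts="vers=4,defaults,auto,tcp,retrans=10,timeo=30,rsize=8192,wsize=8192,noatime"

# Print the value of one key=value mount option.
opt_value() {
    echo "$opts" | tr ',' '\n' | awk -F= -v key="$1" '$1 == key { print $2 }'
}

# timeo is expressed in tenths of a second on Linux.
timeo_seconds() {
    awk -v t="$(opt_value timeo)" 'BEGIN { printf "%.1f\n", t / 10 }'
}

echo "wait before an RPC is retried: $(timeo_seconds) s"
echo "retries before further recovery action: $(opt_value retrans)"
echo "read/write size: $(opt_value rsize)/$(opt_value wsize) bytes"
```

So timeo=30 means the client waits 3 seconds for a reply before it
retries a request.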



The nfsstat command on the server gives:


Client Info:
Rpc Counts:
   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
    139585         0    399150         0    137485         0         0         0
    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
         0         0         0         0         0    138482         0    393052
     Mknod    Fsstat    Fsinfo  PathConf    Commit
         0     94496         4         0         0
Rpc Info:
  TimedOut   Invalid X Replies   Retries  Requests
         0         0         0         0   1302226
Cache Info:
Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
 16703860    139581  13000702    399150    667954    139782         0         0
BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses Accs Hits    Misses
        0         0    116186    116459     94263         0  13744785    393052

Server Info:
   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
   3200367     39011     89025        41 203807584    806982       410      7656
    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
      6383       101         2         6         1      2880    265953   1643074
     Mknod    Fsstat    Fsinfo  PathConf    Commit
         0     94770        30        15     18370
Server Ret-Failed
                 0
Server Faults
             0
Server Cache Stats:
    Inprog      Idem  Non-idem    Misses
         0         0         0 209101544
Server Write Gathering:
  WriteOps  WriteRPC   Opsaved
    806982    806982         0


And nfsstat on client:


Client rpc stats:
calls      retrans    authrefrsh
327404     1293       327395

Client nfs v3:
null             getattr          setattr          lookup           access
0         0%     367      49%     0         0%     3         0%     3         0%
readlink         read             write            create           mkdir
0         0%     0         0%     0         0%     0         0%     0         0%
symlink          mknod            remove           rmdir            rename
0         0%     0         0%     0         0%     0         0%     0         0%
link             readdir          readdirplus      fsstat           fsinfo
0         0%     0         0%     1         0%     363      49%     2         0%
pathconf         commit
1         0%     0         0%

Client nfs v4:
null             read             write            commit           open
0         0%     16452     5%     169214   51%     6150      1%     17026     5%
open_conf        open_noat        open_dgrd        close            setattr
11        0%     0         0%     4         0%     12719     3%     19691     6%
fsinfo           renew            setclntid        confirm          lock
12        0%     480       0%     6         0%     6         0%     11578     3%
lockt            locku            access           getattr          lookup
35        0%     10085     3%     5545      1%     24591     7%     17443     5%
lookup_root      remove           rename           link             symlink
3         0%     1612      0%     4266      1%     31        0%     15        0%
create           pathconf         statfs           readlink         readdir
105       0%     9         0%     7093      2%     4         0%     398       0%
server_caps      delegreturn      getacl           setacl           fs_locations
21        0%     0         0%     0         0%     0         0%     0         0%
rel_lkowner      secinfo          fsid_present     exchange_id      create_session
2051      0%     0         0%     0         0%     0         0%     0         0%
destroy_session  sequence         get_lease_time   reclaim_comp     layoutget
0         0%     0         0%     0         0%     0         0%     0         0%
getdevinfo       layoutcommit     layoutreturn     secinfo_no       test_stateid
0         0%     0         0%     0         0%     0         0%     0         0%
free_stateid     getdevicelist    bind_conn_to_ses destroy_clientid seek
0         0%     0         0%     0         0%     0         0%     0         0%
allocate         deallocate       layoutstats      clone
0         0%     0         0%     0         0%     0         0%
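
To put the "Client rpc stats" counters above into proportion, the
retransmission rate can be worked out from calls and retrans. A minimal
sketch (retrans_pct is a made-up helper name); whether that rate is
enough to explain the hangs is not something the number alone settles:

```shell
#!/bin/sh
# Percentage of RPC calls that had to be retransmitted.
# Arguments: total calls, retransmissions.
retrans_pct() {
    awk -v calls="$1" -v retrans="$2" \
        'BEGIN { printf "%.2f\n", 100 * retrans / calls }'
}

# Values taken from the client nfsstat output above.
echo "retransmitted: $(retrans_pct 327404 1293)% of RPC calls"
```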



The server isn't loaded at all: load is around 0.40, and the network
is also pretty free, as the system has 4x NICs in lagg with current
throughput under 10Mb/s.


Would anyone be able to offer any advice?


Many thanks.



