ZFS - NFS server for VMware ESXi issues
Rick Macklem
rmacklem at uoguelph.ca
Mon Oct 24 06:55:21 UTC 2016
Marek Salwerowicz wrote:
>Hi Rick,
>
>On 2016-10-21 at 23:47, Rick Macklem wrote:
>>
>>
>> Btw, about the only area of the NFS server that might need tuning is the DRC,
>> and this doesn't suggest that. If you run "nfsstat -e -s" on the server and see
>> large #s for the last line under "Server Cache Stats:", there are tunables that
>> can be used.
>> I'd also suggest you capture the output of "ps axHl" on the server when it
>> happens again, which tells you what all the nfsd threads are up to.
>
>I checked the
>#ps axHL | grep nfs
>now:
>http://pastebin.com/x9LTN0nn
What is there now is normal. "rpcsvc" just means the thread is waiting for an RPC
request from a client. This info might be useful when the server is hung/livelocked.
>
>It looks like I have ~64 nfs threads, each consuming about one hour of CPU time.
If one hour of CPU seems excessive for you, you can disable the DRC.
See below w.r.t. this.
>That corresponds to:
># ps axl | grep nfs
>UID  PID PPID CPU PRI NI   VSZ  RSS MWCHAN STAT TT     TIME COMMAND
>  0 1948    1   0  28  0 24632 5832 select Is    -  0:00.10 nfsd: master (nfsd)
>  0 1949 1948   0  24  0 12344 4132 rpcsvc I     - 66:56.42 nfsd: server (nfsd)
>
>is it OK if threads are not being "recuperated" ?
Not sure what you mean by this, but newer FreeBSD systems have minthreads and
maxthreads options on the nfsd to set lower/upper bounds on the # of threads.
To be honest, having too many threads doesn't have much negative impact, so I
wouldn't worry about having too many.
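As a rough sketch of setting those bounds, assuming the nfsd on your release
accepts the --minthreads and --maxthreads flags (check nfsd(8)) and that nfsd is
started from /etc/rc.conf via nfs_server_flags. The thread counts here are
illustrative only, not a recommendation:

```shell
# /etc/rc.conf fragment (assumed values; adjust to your workload)
nfs_server_enable="YES"
# -u/-t serve UDP and TCP; the two long options bound the number of nfsd threads
nfs_server_flags="-u -t --minthreads 16 --maxthreads 64"
```

The new flags would only take effect after nfsd is restarted (for example with
"service nfsd restart"), which briefly interrupts the clients.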
>The NFS statistics are as follows:
># nfsstat -e -s
>
>Server Info:
>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>     97818       311    107539         0  12018551  25266454       858       567
>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>       296         0         0         0         0         0       427      7216
>     Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
>         0      2232         0         0         0         0         0         0
>      Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
>         0         0         0         0         0         0         0         0
>     LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
>         0         0         0         0         0         0         0         0
>     Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
>         0         0         0         0         0         0
>Server:
>Retfailed Faults Clients
> 0 0 0
>OpenOwner Opens LockOwner Locks Delegs
> 0 0 0 0 0
>Server Cache Stats:
> Inprog Idem Non-idem Misses CacheSize TCPPeak
> 0 0 0 37502946 94 592
>
>
>Is there any way I could decrease the number of misses?
Break your network badly;-)
You don't want hits for a Duplicate Request Cache (DRC). It doesn't improve performance;
it improves correctness by preventing a retried RPC from being performed more than once
on the server. (i.e. hits are BAD. Since the first 3 numbers are 0, there are 0 hits, and
that is good. A DRC is mainly for UDP mounts where the client retries the RPC too
aggressively. For TCP, RPCs are only retried when a client does a TCP reconnect.)
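If you want to watch for hits over time, here is a minimal sketch that adds up
the first three numbers on that line. It assumes the "nfsstat -e -s" layout shown
in the quoted output (a "Server Cache Stats:" line, then a header line starting
with "Inprog", then the values); drc_hits is a made-up helper name, not part of
any FreeBSD tool:

```shell
# Read "nfsstat -e -s" output on stdin and print the DRC hit and miss counts.
# Hits are taken as Inprog + Idem + Non-idem, the first three columns.
drc_hits() {
    awk '
        /^Server Cache Stats:/          { in_cache = 1; next }   # start of the DRC section
        in_cache && /Inprog/            { have_header = 1; next } # skip the column names
        in_cache && have_header && NF >= 4 {
            printf "hits=%d misses=%d\n", $1 + $2 + $3, $4
            exit
        }
    '
}
```

On the server it would be used as "nfsstat -e -s | drc_hits". A hits value that
stays at 0 matches what is described above as the good case.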
Disabling the DRC will reduce the CPU overheads, but does put your data at risk if/when
a client does a TCP reconnect.
You can disable the DRC for TCP via:
sysctl vfs.nfsd.cachetcp=0
OR you can leave it enabled and set:
sysctl vfs.nfsd.tcphighwater=100000
which allows the cache to grow larger, reducing the CPU overhead of its
housekeeping. (Trading kernel memory for lower CPU use.)
Again, disabling the cache will reduce CPU overheads, but does put your data at
risk if/when a client does a TCP reconnect and resends outstanding RPCs to the server.
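A short sketch of checking the current values before changing anything and of
keeping the safer of the two settings across reboots. It assumes the NFS server
is already running, so the vfs.nfsd sysctls exist, and it uses the standard
/etc/sysctl.conf mechanism; the 100000 value is simply the one mentioned above:

```shell
# Show the current DRC settings for TCP.
sysctl vfs.nfsd.cachetcp vfs.nfsd.tcphighwater

# Keep the DRC enabled but let it grow larger before it is trimmed.
sysctl vfs.nfsd.tcphighwater=100000

# To make this persistent, the same name=value line can go in /etc/sysctl.conf:
#   vfs.nfsd.tcphighwater=100000
```

Raising tcphighwater keeps the protection the DRC gives on a TCP reconnect,
which setting cachetcp to 0 gives up.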
I doubt any of this DRC tuning will affect your hangs.
Good luck with it, rick