High Kernel Load with nfsv4

Loïc Blot loic.blot at unix-experience.fr
Wed Dec 10 11:33:27 UTC 2014


Hi Rick,
I'm now trying NFSv3.
Some jails start up fine, but after a few minutes I run into an issue with lockd:

nfs server 10.10.X.8:/jails: lockd not responding
nfs server 10.10.X.8:/jails lockd is alive again

I looked at the mbuf statistics, but there does not seem to be a problem there.
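
In case it is useful, something like the following should show whether mbufs or the lock daemons are the bottleneck (only a rough sketch; the address is the server from the log above):

# netstat -m                      (mbuf clusters in use / requests denied)
# rpcinfo -t 10.10.X.8 nlockmgr   (is rpc.lockd answering on the server?)
# rpcinfo -t 10.10.X.8 status     (same check for rpc.statd)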

Here is my rc.conf on server:

nfs_server_enable="YES"
nfsv4_server_enable="YES"
nfsuserd_enable="YES"
nfsd_server_flags="-u -t -n 256"
mountd_enable="YES"
mountd_flags="-r"
nfsuserd_flags="-usertimeout 0 -force 20"
rpcbind_enable="YES"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"

Here is the client:

nfsuserd_enable="YES"
nfsuserd_flags="-usertimeout 0 -force 20"
nfscbd_enable="YES"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"
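
For reference, the NFSv3 mount on the hypervisor is roughly this fstab line (only a sketch: the client-side mount point is just an example, and I understand mount_nfs also has a "nolockd" option if forwarding locks over the wire turns out to be the culprit):

10.10.X.8:/jails  /jails  nfs  rw,nfsv3,tcp,rsize=32768,wsize=32768  0  0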

Have you got any ideas?

Regards,

Loïc Blot,
UNIX Systems, Network and Security Engineer
http://www.unix-experience.fr

9 December 2014 04:31, "Rick Macklem" <rmacklem at uoguelph.ca> wrote: 
> Loic Blot wrote:
> 
>> Hi rick,
>> 
>> I waited 3 hours (no lag at jail launch) and then ran: sysrc
>> memcached_flags="-v -m 512"
>> The command was very, very slow...
>> 
>> Here is a dd over NFS:
>> 
>> 601062912 bytes transferred in 21.060679 secs (28539579 bytes/sec)
> 
> Can you try the same read using an NFSv3 mount?
> (If it runs much faster, you have probably been bitten by the ZFS
> "sequential vs random" read heuristic, which I've been told thinks NFS
> is doing "random" reads without file handle affinity. File handle
> affinity is very hard to do for NFSv4, so it isn't done.)
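> 
> Roughly, as a sketch (made-up mount point and file name, adjust to your
> setup):
> # mount -t nfs -o nfsv3,tcp,rsize=32768,wsize=32768 10.10.X.8:/jails /mnt
> # dd if=/mnt/<some large file> of=/dev/null bs=1m
> # umount /mnt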
> 
> rick
> 
>> This is quite slow...
>> 
>> You can find some nfsstat output below (the command isn't finished yet)
>> 
>> nfsstat -c -w 1
>> 
>> GtAttr Lookup Rdlink Read Write Rename Access Rddir
>> 0 0 0 0 0 0 0 0
>> 4 0 0 0 0 0 16 0
>> 2 0 0 0 0 0 17 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 4 0 0 0 0 4 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 4 0 0 0 0 0 3 0
>> 0 0 0 0 0 0 3 0
>> 37 10 0 8 0 0 14 1
>> 18 16 0 4 1 2 4 0
>> 78 91 0 82 6 12 30 0
>> 19 18 0 2 2 4 2 0
>> 0 0 0 0 2 0 0 0
>> 0 0 0 0 0 0 0 0
>> GtAttr Lookup Rdlink Read Write Rename Access Rddir
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 1 0 0 0 0 1 0
>> 4 6 0 0 6 0 3 0
>> 2 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 1 0 0 0 0 0 0 0
>> 0 0 0 0 1 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 6 108 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> GtAttr Lookup Rdlink Read Write Rename Access Rddir
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 98 54 0 86 11 0 25 0
>> 36 24 0 39 25 0 10 1
>> 67 8 0 63 63 0 41 0
>> 34 0 0 35 34 0 0 0
>> 75 0 0 75 77 0 0 0
>> 34 0 0 35 35 0 0 0
>> 75 0 0 74 76 0 0 0
>> 33 0 0 34 33 0 0 0
>> 0 0 0 0 5 0 0 0
>> 0 0 0 0 0 0 6 0
>> 11 0 0 0 0 0 11 0
>> 0 0 0 0 0 0 0 0
>> 0 17 0 0 0 0 1 0
>> GtAttr Lookup Rdlink Read Write Rename Access Rddir
>> 4 5 0 0 0 0 12 0
>> 2 0 0 0 0 0 26 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 4 0 0 0 0 4 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 4 0 0 0 0 0 2 0
>> 2 0 0 0 0 0 24 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> GtAttr Lookup Rdlink Read Write Rename Access Rddir
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 4 0 0 0 0 0 7 0
>> 2 1 0 0 0 0 1 0
>> 0 0 0 0 2 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 6 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 4 6 0 0 0 0 3 0
>> 0 0 0 0 0 0 0 0
>> 2 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> GtAttr Lookup Rdlink Read Write Rename Access Rddir
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 4 71 0 0 0 0 0 0
>> 0 1 0 0 0 0 0 0
>> 2 36 0 0 0 0 1 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 1 0 0 0 0 0 1 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 79 6 0 79 79 0 2 0
>> 25 0 0 25 26 0 6 0
>> 43 18 0 39 46 0 23 0
>> 36 0 0 36 36 0 31 0
>> 68 1 0 66 68 0 0 0
>> GtAttr Lookup Rdlink Read Write Rename Access Rddir
>> 36 0 0 36 36 0 0 0
>> 48 0 0 48 49 0 0 0
>> 20 0 0 20 20 0 0 0
>> 0 0 0 0 0 0 0 0
>> 3 14 0 1 0 0 11 0
>> 0 0 0 0 0 0 0 0
>> 0 0 0 0 0 0 0 0
>> 0 4 0 0 0 0 4 0
>> 0 0 0 0 0 0 0 0
>> 4 22 0 0 0 0 16 0
>> 2 0 0 0 0 0 23 0
>> 
>> Regards,
>> 
>> Loïc Blot,
>> UNIX Systems, Network and Security Engineer
>> http://www.unix-experience.fr
>> 
>> 8 December 2014 09:36, "Loïc Blot" <loic.blot at unix-experience.fr>
>> wrote:
>>> Hi Rick,
>>> I stopped the jails this weekend and started them again this morning;
>>> I'll give you some stats this week.
>>> 
>>> Here is my nfsstat -m output (with your rsize/wsize tweaks)
>>> 
>>> 
>>> nfsv4,tcp,resvport,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,
>>> nametimeo=60,negnametimeo=60,rsize=32768,wsize=32768,readdirsize=32768,readahead=1,
>>> wcommitsize=773136,timeout=120,retrans=2147483647
>>> 
>>> On the server side my disks are behind a RAID controller that exposes a
>>> volume with 512-byte sectors, and write performance is quite good
>>> (dd if=/dev/zero of=/jails/test.dd bs=4096 count=100000000 => 450 MB/s)
>>> 
>>> Regards,
>>> 
>>> Loïc Blot,
>>> UNIX Systems, Network and Security Engineer
>>> http://www.unix-experience.fr
>>> 
>>> 5 December 2014 15:14, "Rick Macklem" <rmacklem at uoguelph.ca>
>>> wrote:
>>> 
>>>> Loic Blot wrote:
>>>> 
>>>>> Hi,
>>>>> I'm trying to create a virtualisation environment based on jails.
>>>>> The jails are stored in a big ZFS pool on a FreeBSD 9.3 server
>>>>> which exports an NFSv4 volume. This NFSv4 volume is mounted on a
>>>>> big hypervisor (2 Xeon E5v3, 128GB of memory and 8 network ports,
>>>>> of which only 1 is used at this time).
>>>>> 
>>>>> The problem is simple: my hypervisor runs 6 jails (using roughly 1%
>>>>> CPU, 10GB of RAM and less than 1MB/s of bandwidth) and works fine
>>>>> at first, but the system slows down and after 2-3 days becomes
>>>>> unusable. When I look at top I see 80-100% system CPU and commands
>>>>> are very, very slow. Many processes are stuck in nfs_cl* states.
>>>> 
>>>> To be honest, I would expect the slowness to be because of slow
>>>> response
>>>> from the NFSv4 server, but if you do:
>>>> # ps axHl
>>>> on a client when it is slow and post that, it would give us some
>>>> more
>>>> information on where the client side processes are sitting.
>>>> If you also do something like:
>>>> # nfsstat -c -w 1
>>>> and let it run for a while, that should show you how many RPCs are
>>>> being done and which ones.
>>>> 
>>>> # nfsstat -m
>>>> will show you what your mount is actually using.
>>>> The only mount option I can suggest trying is
>>>> "rsize=32768,wsize=32768",
>>>> since some network environments have difficulties with 64K.
>>>> 
>>>> There are a few things you can try on the NFSv4 server side, if it
>>>> appears
>>>> that the clients are generating a large RPC load.
>>>> - disabling the DRC cache for TCP by setting vfs.nfsd.cachetcp=0
>>>> - If the server is seeing a large write RPC load, then
>>>> "sync=disabled"
>>>> might help, although it does run a risk of data loss when the
>>>> server
>>>> crashes.
>>>> Then there are a couple of other ZFS related things (I'm not a ZFS
>>>> guy,
>>>> but these have shown up on the mailing lists).
>>>> - make sure your volumes are 4K aligned and ashift=12 (in case a
>>>> drive
>>>> that uses 4K sectors is pretending to be 512byte sectored)
>>>> - never run over 70-80% full if write performance is an issue
>>>> - use a zil on an SSD with good write performance
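>>>> 
>>>> Rough sketch of the corresponding commands (the pool/dataset name
>>>> "jails" and the SSD device "da6" are only assumptions, adjust to your
>>>> setup):
>>>> # sysctl vfs.nfsd.cachetcp=0
>>>> # zfs set sync=disabled jails
>>>> # zdb | grep ashift          (should report ashift: 12 for 4K alignment)
>>>> # zpool list                 (CAP column; stay under ~70-80%)
>>>> # zpool add jails log da6    (dedicated log/ZIL device on the SSD)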
>>>> 
>>>> The only NFSv4 thing I can tell you is that it is known that ZFS's
>>>> algorithm for determining sequential vs random I/O fails for NFSv4
>>>> during writing and this can be a performance hit. The only
>>>> workaround
>>>> is to use NFSv3 mounts, since file handle affinity apparently
>>>> fixes
>>>> the problem and this is only done for NFSv3.
>>>> 
>>>> rick
>>>> 
>>>>> I saw that there are TSO issues with igb, so I tried disabling TSO
>>>>> with sysctl, but that didn't solve the problem.
>>>>> 
>>>>> Does anyone have any ideas? I can give you more information if you
>>>>> need it.
>>>>> 
>>>>> Thanks in advance.
>>>>> Regards,
>>>>> 
>>>>> Loïc Blot,
>>>>> UNIX Systems, Network and Security Engineer
>>>>> http://www.unix-experience.fr
>>> 

