High Kernel Load with nfsv4
Loïc Blot
loic.blot at unix-experience.fr
Mon Dec 8 08:36:33 UTC 2014
Hi Rick,
I stopped the jails this weekend and restarted them this morning; I'll give you some stats this week.
Here is my nfsstat -m output (with your rsize/wsize tweaks):
nfsv4,tcp,resvport,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=32768,wsize=32768,readdirsize=32768,readahead=1,wcommitsize=773136,timeout=120,retrans=2147483647
On the server side, my disks sit behind a RAID controller that presents a volume with 512-byte sectors, and write performance is quite decent (dd if=/dev/zero of=/jails/test.dd bs=4096 count=100000000 => 450 MB/s).
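To confirm that the rsize/wsize change really took effect, the option line printed by nfsstat -m can be split into one option per line and checked. This is a minimal sketch for a POSIX shell: the two helper functions are made up for illustration (they are not part of nfsstat), and the option string is a shortened copy of the output pasted above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of nfsstat): split a comma-separated
# NFS mount option string and look up single options.

# Print one option per line.
split_mount_opts() {
    printf '%s\n' "$1" | tr ',' '\n'
}

# Print the value of a key=value option; return non-zero if the key is absent.
mount_opt_value() {
    split_mount_opts "$1" | awk -F= -v key="$2" '
        $1 == key { print $2; found = 1 }
        END { exit !found }'
}

# Option string copied from the nfsstat -m output in this message (shortened).
opts='nfsv4,tcp,resvport,hard,cto,sec=sys,rsize=32768,wsize=32768,readdirsize=32768,readahead=1'

for key in rsize wsize; do
    echo "$key=$(mount_opt_value "$opts" "$key")"
done
```

On a live client, the string in opts would be replaced by the option line that nfsstat -m prints for the mount in question.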
Regards,
Loïc Blot,
UNIX Systems, Network and Security Engineer
http://www.unix-experience.fr
On 5 December 2014 at 15:14, "Rick Macklem" <rmacklem at uoguelph.ca> wrote:
> Loic Blot wrote:
>
>> Hi,
>> I'm trying to create a virtualisation environment based on jails.
>> Those jails are stored in a big ZFS pool on a FreeBSD 9.3 server which
>> exports an NFSv4 volume. This NFSv4 volume is mounted on a big
>> hypervisor (2 Xeon E5v3 + 128GB memory and 8 ports, of which only 1
>> is used at this time).
>>
>> The problem is simple: my hypervisor runs 6 jails (using approximately
>> 1% CPU, 10GB RAM and less than 1MB of bandwidth) and works fine at
>> first, but the system slows down and becomes unusable after 2-3 days.
>> When I look at top, I see 80-100% system time and commands are very,
>> very slow. Many processes are tagged with nfs_cl*.
>
> To be honest, I would expect the slowness to be because of slow response
> from the NFSv4 server, but if you do:
> # ps axHl
> on a client when it is slow and post that, it would give us some more
> information on where the client side processes are sitting.
> If you also do something like:
> # nfsstat -c -w 1
> and let it run for a while, that should show you how many RPCs are
> being done and which ones.
>
> # nfsstat -m
> will show you what your mount is actually using.
> The only mount option I can suggest trying is "rsize=32768,wsize=32768",
> since some network environments have difficulties with 64K.
>
> There are a few things you can try on the NFSv4 server side, if it appears
> that the clients are generating a large RPC load.
> - disabling the DRC cache for TCP by setting vfs.nfsd.cachetcp=0
> - If the server is seeing a large write RPC load, then "sync=disabled"
> might help, although it does run a risk of data loss when the server
> crashes.
> Then there are a couple of other ZFS related things (I'm not a ZFS guy,
> but these have shown up on the mailing lists).
> - make sure your volumes are 4K aligned and ashift=12 (in case a drive
> that uses 4K sectors is pretending to be 512byte sectored)
> - never run over 70-80% full if write performance is an issue
> - use a zil on an SSD with good write performance
>
> The only NFSv4 thing I can tell you is that it is known that ZFS's
> algorithm for determining sequential vs random I/O fails for NFSv4
> during writing and this can be a performance hit. The only workaround
> is to use NFSv3 mounts, since file handle affinity apparently fixes
> the problem and this is only done for NFSv3.
>
> rick
>
>> I saw that there are TSO issues with igb, so I tried disabling
>> it with sysctl, but that didn't solve the problem.
>>
>> Does anyone have any ideas? I can give you more information if you
>> need it.
>>
>> Thanks in advance.
>> Regards,
>>
>> Loïc Blot,
>> UNIX Systems, Network and Security Engineer
>> http://www.unix-experience.fr
>> _______________________________________________
>> freebsd-fs at freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"
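
The server-side suggestions in Rick's reply can be gathered into one small script. This is only a sketch under stated assumptions: the pool and dataset names (tank, tank/jails) are placeholders and not the real names from this setup, and with DRY_RUN=1 (the default here) the commands are printed instead of executed. Note that sync=disabled carries the data-loss risk Rick mentions.

```shell
#!/bin/sh
# Sketch only: POOL and DATASET are placeholder names, adjust to the real pool.
POOL="${POOL:-tank}"
DATASET="${DATASET:-tank/jails}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

nfs_server_tuning() {
    # Disable the duplicate request cache (DRC) for TCP.
    run sysctl vfs.nfsd.cachetcp=0
    # Risky: recent writes may be lost if the server crashes.
    run zfs set sync=disabled "$DATASET"
    # Show the pool fill level (advice: stay below 70-80% full).
    run zpool list -o name,capacity "$POOL"
    # Dump the pool configuration; look for "ashift: 12" in the output.
    run zdb -C "$POOL"
}

nfs_server_tuning
```

Running it once with the default DRY_RUN=1 shows exactly which commands would be issued; setting DRY_RUN=0 on the NFS server applies them for real.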