Re: NFSv4 client hung
- In reply to: Alexandre Biancalana : "Re: NFSv4 client hung"
Date: Tue, 02 Sep 2025 21:19:32 UTC
On Tue, Sep 2, 2025 at 9:46 AM Alexandre Biancalana <biancalana@gmail.com> wrote:
>
> On Tue, Sep 2, 2025 at 11:00 AM Rick Macklem <rick.macklem@gmail.com> wrote:
> >
> > On Tue, Sep 2, 2025 at 6:01 AM Alexandre Biancalana <biancalana@gmail.com> wrote:
> > >
> > > Hi Rick! Thank you for the answer.
> > >
> > > I also think that it has nothing to do with the server, because there is another nfs client (also running vms with bhyve) that keeps running.
> > >
> > > To make sure that I understood: in my setup the nfs client is a physical host that mounts the nfs shares holding the vms' disks. I then run those vms with bhyve; the vms themselves do not mount any nfs share.
> > > The hang happens when I try to access the nfs-mounted shares on the physical host and, I think, as a consequence the vms also freeze when trying to do io.
> > >
> > > Your suggestion is to increase the amount of memory of the vms?
> >
> > Oops, yes, the buffer cache problem would be on the physical system,
> > given that is where the mount is done.
> >
> > > For educational purposes, can you point me to the part of the code that uses newbuf, so I can try to learn something?
> >
> > sys/kern/vfs_bio.c
>
> Thanks, I'm reading!
>
> > There are some sysctls you can look at. You'll get them by:
> > # sysctl -a | fgrep vfs | fgrep buffer
>
> root@bhyve01:~ # sysctl -a | fgrep vfs | fgrep buffer
> vfs.hifreebuffers: 5376
> vfs.lofreebuffers: 3584
> vfs.numfreebuffers: 105931
> vfs.hidirtybuffers: 26502
> vfs.lodirtybuffers: 13251
> vfs.numdirtybuffers: 220
> vfs.altbufferflushes: 0
> vfs.dirtybufferflushes: 0
>
> > # sysctl -a | fgrep bufspace
>
> root@bhyve01:~ # sysctl -a | fgrep bufspace
> vfs.bufspacethresh: 1681960548
> vfs.hibufspace: 1725087744
> vfs.lobufspace: 1638833353
> vfs.maxmallocbufspace: 86254387
> vfs.maxbufspace: 1735573504
> vfs.bufspace: 502060032
> vfs.runningbufspace: 0
>
> > - Some of these can be adjusted. If you look in sys/kern/vfs_bio.c,
> > you can see which ones are CTLFLAG_RW.
>
> I've instrumented a collection of those values every 10s, storing them in a
> tsdb (/usr/sbin/prometheus_sysctl_exporter | grep vfs | grep buf), so
> we can track the values over time.
> I still haven't fully understood the mechanism, but what I think makes sense
> to measure/watch is:
>
> - runningbufspace: if the number of outstanding requests grows a lot,
>   it can be a signal of a stall
> - bufkvaspace
> - bufspace/maxbufspace = total usage of bufspace
> - bufmallocspace/maxmallocbufspace = total usage of malloced memory for buffers
> - bdwriteskip
> - numdirtybuffers: data not persisted to backing store
> - numfreebuffers
> - lofreebuffers
> - getnewbufrestarts
> - mappingrestarts
> - numbufallocfails
> - notbufdflushes
>
> > Also, you can see exactly what the NFS mount setup is by:
> > # nfsstat -m
> > - If you post the output from this, I might be able to suggest
> > some mount option changes.
>
> As I said, I have two machines and they had the same config. When I
> started to have the problem I removed all the tuning and rolled back
> to nfsv3 on bhyve01. Sadly, bhyve01 still hangs from time to time. I'm
> going to share nfsstat from both machines.
>
> root@bhyve01:~ # nfsstat -m
> 10.10.10.10:/mnt/datastore0/bhyve_instances on /vms
> nfsv3,tcp,resvport,nconnect=1,hard,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=1048576,wsize=1048576,readdirsize=1048576,readahead=1,wcommitsize=16777216,timeout=120,retrans=2
> 10.10.10.10:/mnt/datastore1/iso on /vms/.iso
> nfsv3,tcp,resvport,nconnect=1,hard,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=1048576,wsize=1048576,readdirsize=1048576,readahead=1,wcommitsize=16777216,timeout=120,retrans=2
> 10.10.10.10:/mnt/ds_ssd_vms_03/disks on /vms/.disks
> nfsv3,tcp,resvport,nconnect=1,hard,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=1048576,wsize=1048576,readdirsize=1048576,readahead=1,wcommitsize=16777216,timeout=120,retrans=2
>
> root@bhyve02:~ # nfsstat -m
> 10.10.10.10:/mnt/datastore0/bhyve_instances on /vms
> nfsv4,minorversion=2,tcp,resvport,nconnect=16,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=1048576,wsize=1048576,readdirsize=1048576,readahead=8,wcommitsize=67108864,timeout=120,retrans=2147483647
> 10.10.10.10:/mnt/datastore1/iso on /vms/.iso
> nfsv4,minorversion=2,tcp,resvport,nconnect=16,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=1048576,wsize=1048576,readdirsize=1048576,readahead=8,wcommitsize=67108864,timeout=120,retrans=2147483647
> 10.10.10.10:/mnt/ds_ssd_vms_03/disks on /vms/.disks
> nfsv4,minorversion=2,tcp,resvport,nconnect=16,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=1048576,wsize=1048576,readdirsize=1048576,readahead=8,wcommitsize=67108864,timeout=120,retrans=2147483647

The only mount option I can see that might be worth fiddling with is
"wcommitsize". It is "how much can be cached before a commit is done".
The buffer cannot be re-used until it is committed, so you might try
making it smaller? (A sketch of one way to try that follows the quoted
text below.)

rick

> >
> > I do not know how bhyve reads/writes the image file?
> > (That might be a hint as well, since that is probably
> > what is unique about your setup.)
> >
> > rick
> >
> > >
> > > Ale
> > >
> > > On Mon, 1 Sep 2025 at 22:49 Rick Macklem <rick.macklem@gmail.com> wrote:
> > >>
> > >> For some reason, I cannot reply to your email
> > >> (might be the size of it), so I'll post a simple
> > >> comment.
> > >>
> > >> As you noted, processes are stuck on newbuf in
> > >> the client. This probably has nothing to do with
> > >> the server. It also looks like the clients are bhyve.
> > >>
> > >> Bump the memory size of the bhyve clients up,
> > >> maybe way up.
> > >> --> There are ways to tune the size of the buffer
> > >> cache, but bumping up the VM's ram should
> > >> give you more.
> > >>
> > >> rick
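
A minimal sketch of the wcommitsize experiment suggested above, assuming the
v4 mount can be unmounted (i.e. the vms using it are stopped first). The 16MB
value below is only illustrative, chosen because it matches the bhyve01 nfsv3
mounts; it is not a tested recommendation:

# umount /vms/.disks
# mount -t nfs -o nfsv4,minorversion=2,nconnect=16,wcommitsize=16777216 10.10.10.10:/mnt/ds_ssd_vms_03/disks /vms/.disks

The remaining options shown in the nfsstat -m output would be carried over
unchanged, or the value could instead be set permanently in the fstab entry.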
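
Also, a minimal sketch of the 10-second collection loop Alexandre describes,
assuming a Bourne-style shell; the output path /var/tmp/vfs_buf.prom is
hypothetical and only meant to show snapshotting the exporter output for a
tsdb to scrape:

# while true; do /usr/sbin/prometheus_sysctl_exporter | grep vfs | grep buf > /var/tmp/vfs_buf.prom; sleep 10; done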