Re: NFSv4 client hung
- Reply: Rick Macklem : "Re: NFSv4 client hung"
- In reply to: Rick Macklem : "Re: NFSv4 client hung"
Date: Tue, 02 Sep 2025 16:46:14 UTC
On Tue, Sep 2, 2025 at 11:00 AM Rick Macklem <rick.macklem@gmail.com> wrote:
>
> On Tue, Sep 2, 2025 at 6:01 AM Alexandre Biancalana
> <biancalana@gmail.com> wrote:
> >
> > Hi Rick! Thank you for the answer.
> >
> > I also think that it has nothing to do with the server, because there is
> > another NFS client (also running VMs with bhyve) that keeps running.
> >
> > To make sure that I understood: in my setup the NFS client is a physical
> > host that mounts the NFS share holding the VM disks. I then run those VMs
> > with bhyve; the VMs themselves do not mount any NFS share.
> > The hang happens when I try to access the NFS-mounted shares on the
> > physical host, and (I think) as a consequence the VMs also freeze when
> > trying to do I/O.
> >
> > Your suggestion is to increase the amount of memory of the VMs?
> Oops, yes, the buffer cache problem would be on the physical system,
> given that is where the mount is done.
> >
> > For educational purposes, can you point me to the code that uses newbuf
> > so I can try to learn something?
> sys/kern/vfs_bio.c

Thanks, I'm reading!

> There are some sysctls you can look at. You'll get them by:
> # sysctl -a | fgrep vfs | fgrep buffer

root@bhyve01:~ # sysctl -a | fgrep vfs | fgrep buffer
vfs.hifreebuffers: 5376
vfs.lofreebuffers: 3584
vfs.numfreebuffers: 105931
vfs.hidirtybuffers: 26502
vfs.lodirtybuffers: 13251
vfs.numdirtybuffers: 220
vfs.altbufferflushes: 0
vfs.dirtybufferflushes: 0

> # sysctl -a | fgrep bufspace

root@bhyve01:~ # sysctl -a | fgrep bufspace
vfs.bufspacethresh: 1681960548
vfs.hibufspace: 1725087744
vfs.lobufspace: 1638833353
vfs.maxmallocbufspace: 86254387
vfs.maxbufspace: 1735573504
vfs.bufspace: 502060032
vfs.runningbufspace: 0

> - Some of these can be adjusted. If you look in sys/kern/vfs_bio.c,
> you can see which ones are CTLFLAG_RW.

I've instrumented collection of those values every 10s, storing them in a
tsdb (/usr/sbin/prometheus_sysctl_exporter | grep vfs | grep buf), so we can
track the values over time. (A rough sketch of the sampling loop is at the
end of this message.)

I still haven't fully understood the mechanism, but what I think makes sense
to measure/watch is:

- runningbufspace: if the number of outstanding requests grows a lot, it can
  be a sign of a stall
- bufkvaspace
- bufspace / maxbufspace = total usage of bufspace
- bufmallocspace / maxmallocbufspace = total usage of malloced memory for buffers
- bdwriteskip
- numdirtybuffers: data not yet persisted to the backing store
- numfreebuffers
- lofreebuffers
- getnewbufrestarts
- mappingrestarts
- numbufallocfails
- notbufdflushes

> Also, you can see exactly what the NFS mount setup is by:
> # nfsstat -m
> - If you post the output from this, I might be able to suggest
> some mount option changes.

As I said, I have two machines; they had the same config. When I started to
have the problem I removed all the tuning and rolled back to NFSv3 on bhyve01.
Sadly, bhyve01 still hangs from time to time. I'm sharing nfsstat from both
machines.
root@bhyve01:~ # nfsstat -m
10.10.10.10:/mnt/datastore0/bhyve_instances on /vms
nfsv3,tcp,resvport,nconnect=1,hard,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=1048576,wsize=1048576,readdirsize=1048576,readahead=1,wcommitsize=16777216,timeout=120,retrans=2
10.10.10.10:/mnt/datastore1/iso on /vms/.iso
nfsv3,tcp,resvport,nconnect=1,hard,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=1048576,wsize=1048576,readdirsize=1048576,readahead=1,wcommitsize=16777216,timeout=120,retrans=2
10.10.10.10:/mnt/ds_ssd_vms_03/disks on /vms/.disks
nfsv3,tcp,resvport,nconnect=1,hard,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=1048576,wsize=1048576,readdirsize=1048576,readahead=1,wcommitsize=16777216,timeout=120,retrans=2

root@bhyve02:~ # nfsstat -m
10.10.10.10:/mnt/datastore0/bhyve_instances on /vms
nfsv4,minorversion=2,tcp,resvport,nconnect=16,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=1048576,wsize=1048576,readdirsize=1048576,readahead=8,wcommitsize=67108864,timeout=120,retrans=2147483647
10.10.10.10:/mnt/datastore1/iso on /vms/.iso
nfsv4,minorversion=2,tcp,resvport,nconnect=16,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=1048576,wsize=1048576,readdirsize=1048576,readahead=8,wcommitsize=67108864,timeout=120,retrans=2147483647
10.10.10.10:/mnt/ds_ssd_vms_03/disks on /vms/.disks
nfsv4,minorversion=2,tcp,resvport,nconnect=16,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=1048576,wsize=1048576,readdirsize=1048576,readahead=8,wcommitsize=67108864,timeout=120,retrans=2147483647

> I do not know how bhyve reads/writes the image file?
> (That might be a hint as well, since that is probably
> what is unique about your setup.)
>
> rick
>
> >
> > Ale
> >
> > On Mon, 1 Sep 2025 at 22:49 Rick Macklem <rick.macklem@gmail.com> wrote:
> >>
> >> For some reason, I cannot reply to your email
> >> (might be the size of it), so I'll post a simple
> >> comment.
> >>
> >> As you noted, processes are stuck on newbuf in
> >> the client. This probably has nothing to do with
> >> the server. It also looks like the clients are bhyve.
> >>
> >> Bump the memory size of the bhyve clients up,
> >> maybe way up.
> >> --> There are ways to tune the size of the buffer
> >>     cache, but bumping up the VM's RAM should
> >>     give you more.
> >>
> >> rick
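
PS: for completeness, here is roughly what the 10s sampling loop looks like.
This is only a sketch: the output path and the atomic rename are placeholders
of mine; the real collection can feed whatever your tsdb actually ingests.

#!/bin/sh
# Sketch of the 10-second sampling loop. OUT and the rename step are
# placeholders; substitute whatever your tsdb scrapes or ingests.
OUT=/var/tmp/vfs_buf.prom
while :; do
    # prometheus_sysctl_exporter(8) ships in the FreeBSD base system and
    # prints sysctl values in the Prometheus text format; keep only the
    # buffer-cache counters listed above.
    /usr/sbin/prometheus_sysctl_exporter | grep vfs | grep buf > "${OUT}.tmp"
    mv "${OUT}.tmp" "${OUT}"
    sleep 10
done

With the samples stored, ratios like vfs.bufspace / vfs.maxbufspace and
vfs.bufmallocspace / vfs.maxmallocbufspace can be graphed directly from the
collected series.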