Re: nfs hang
Date: Thu, 13 Nov 2025 13:06:58 UTC
On Thu, Nov 13, 2025 at 2:45 AM Ronald Klop <ronald@freebsd.org> wrote:
>
> On 13-11-2025 at 11:41, Ronald Klop wrote:
> > Hi,
> >
> > I have setup nfsd in a jail. It exports zfs fs. The kernel is 16-CURRENT/aarch64. Jails are 14.3-RELEASE.
> > $ cat /data/jails/pkg/_root/etc/exports
> > V4: / -sec=sys
> >
> > /usr/local/poudriere/data/logs/bulk -sec=sys -maproot=root
> > /usr/local/poudriere/data/packages -sec=sys -maproot=root
> >
> > /usr/ports -sec=sys
> >
> >
> > The clients run poudriere in jails.
> >
> > Now and than I get hanging processes and unresponsive nfs server messages.
> >
> > All NFS threads are in this state:
> > [root@rpi4 ~]# procstat -kk 5973
> >   PID    TID COMM  TDNAME         KSTACK
> >  5973 100541 nfsd  nfsd: master   mi_switch+0x100 sleepq_catch_signals+0x3e4 sleepq_timedwait_sig+0x18 _sleep+0x1a0 clnt_vc_call+0x814 clnt_reconnect_call+0x960 newnfs_request+0xacc nfsrpc_closerpc+0xfc nfscl_tryclose+0x58 nfsrpc_doclose+0x294 nfscl_doclose+0x390 nfsrpc_close+0x28 ncl_inactive+0x14c vop_sigdefer+0x34 vinactivef+0xb8 vput_final+0x1f4 null_reclaim+0x1a0 VOP_RECLAIM_APV+0x20
> >  5973 100784 nfsd  nfsd: service  mi_switch+0x100 sleepq_catch_signals+0x3e4 sleepq_timedwait_sig+0x18 _sleep+0x1a0 clnt_vc_call+0x814 clnt_reconnect_call+0x960 newnfs_request+0xacc nfsrpc_closerpc+0xfc nfscl_tryclose+0x58 nfsrpc_doclose+0x294 nfscl_doclose+0x390 nfsrpc_close+0x28 ncl_inactive+0x14c vop_sigdefer+0x34 vinactivef+0xb8 vput_final+0x1f4 null_reclaim+0x1a0 VOP_RECLAIM_APV+0x20
> >  5973 100785 nfsd  nfsd: service  mi_switch+0x100 sleepq_catch_signals+0x3e4 sleepq_timedwait_sig+0x18 _sleep+0x1a0 clnt_vc_call+0x814 clnt_reconnect_call+0x960 newnfs_request+0xacc nfsrpc_closerpc+0xfc nfscl_tryclose+0x58 nfsrpc_doclose+0x294 nfscl_doclose+0x390 nfsrpc_close+0x28 ncl_inactive+0x14c vop_sigdefer+0x34 vinactivef+0xb8 vput_final+0x1f4 null_reclaim+0x1a0 VOP_RECLAIM_APV+0x20
> >  5973 100786 nfsd  nfsd: service  mi_switch+0x100 sleepq_catch_signals+0x3e4 sleepq_timedwait_sig+0x18 _sleep+0x1a0 clnt_vc_call+0x814 clnt_reconnect_call+0x960 newnfs_request+0xacc nfsrpc_closerpc+0xfc nfscl_tryclose+0x58 nfsrpc_doclose+0x294 nfscl_doclose+0x390 nfsrpc_close+0x28 ncl_inactive+0x14c vop_sigdefer+0x34 vinactivef+0xb8 vput_final+0x1f4 null_reclaim+0x1a0 VOP_RECLAIM_APV+0x20
> > ... and a couple more similar lines ...
> >
> > In rc.conf:
> > nfs_server_enable=YES
> > mountd_enable=YES
> > nfsv4_server_only=YES
> > nfs_server_flags="-t"
> >
> > The filesystems are a zfs legacy mount in the jail:
> > # grep zfs /data/jails/pkg/fstab
> > zrpi4/data/poudriere-logs-bulk /data/jails/pkg/_root/usr/local/poudriere/data/logs/bulk zfs rw 0 0
> > zrpi4/data/poudriere-packages /data/jails/pkg/_root/usr/local/poudriere/data/packages zfs rw 0 0
> > zdata4/ports /data/jails/pkg/_root/usr/ports zfs rw 0 0
> >
> >
> > Interestingly I also have a bash process hanging which should not access NFS at the moment:
> > # procstat -kk 83175
> >   PID    TID COMM  TDNAME  KSTACK
> > 83175 111203 bash  -       mi_switch+0x100 sleeplk+0xf8 lockmgr_slock_hard+0x29c _vn_lock+0x50 vget_finish+0x28 cache_fplookup_final_child+0x54 cache_fplookup+0x538 namei+0xd8 kern_statat+0xd4 sys_fstatat+0x2c do_el0_sync+0x6b4 handle_el0_sync+0x4c
> >
> > Any thoughts?
> >
> > Regards,
> > Ronald.
> >
>
> Just noticed this on the console:
> newnfs: server 'pkg.thuis.klop.ws' error: fileid changed. fsid b7c1283e:5166a2de: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> newnfs: server 'pkg.thuis.klop.ws' error: fileid changed. fsid 3750dc87:289259de: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> newnfs: server 'pkg.thuis.klop.ws' error: fileid changed. fsid 263bc0f2:d39c94de: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> newnfs: server 'pkg.thuis.klop.ws' error: fileid changed. fsid b7c1283e:5166a2de: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> newnfs: server 'pkg.thuis.klop.ws' error: fileid changed. fsid 3750dc87:289259de: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> newnfs: server 'pkg.thuis.klop.ws' error: fileid changed. fsid 263bc0f2:d39c94de: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> newnfs: server 'pkg.thuis.klop.ws' error: fileid changed. fsid b7c1283e:5166a2de: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> newnfs: server 'pkg.thuis.klop.ws' error: fileid changed. fsid 3750dc87:289259de: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> newnfs: server 'pkg.thuis.klop.ws' error: fileid changed. fsid b7c1283e:5166a2de: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> newnfs: server 'pkg.thuis.klop.ws' error: fileid changed. fsid 3750dc87:289259de: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> newnfs: Logged 10 times about fileid corruption; going quiet to avoid spamming logs excessively. (Limit is: 10).

Do you have more than one client mounting the file system?
If you do, make sure they all have different /etc/hostid's.
(Cloning a system disk without deleting /etc/hostid can result in
multiple clients with the same /etc/hostid. That means they are
"the same client" to the NFSv4 server and that can cause the above.)

If this is not the problem, I don't know why you'd see the above,
but I suspect the above explains the hang.

rick
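
One way to check the /etc/hostid suggestion above is to gather each client's hostid and look for values that appear more than once. The following is only a rough sketch, not anything from the thread: the function name find_dup_hostids, the client names client1..client3, and the idea of collecting the values over ssh are all assumptions made for illustration.

```shell
#!/bin/sh
# Read "client hostid" pairs on stdin (one pair per line) and print every
# hostid that is shared by more than one client, followed by those clients.
# find_dup_hostids is a hypothetical helper name, not an existing tool.
find_dup_hostids() {
    awk '
        NF >= 2 { clients[$2] = clients[$2] " " $1; count[$2]++ }
        END {
            for (id in count)
                if (count[id] > 1)
                    print id ":" clients[id]
        }'
}

# Hypothetical usage: client1..client3 are placeholder names, and reaching
# the clients over ssh is an assumption about the setup.
# for h in client1 client2 client3; do
#     printf '%s %s\n' "$h" "$(ssh "$h" cat /etc/hostid)"
# done | find_dup_hostids
```

If the function prints nothing, no two clients in the list share a hostid and the cause is presumably elsewhere. Any line it does print names the clients that the NFSv4 server would treat as the same client.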