Re: nfs: panic: fsync: vnode is not exclusive locked but should be

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Wed, 27 Aug 2025 16:59:14 UTC

On Wed, Aug 27, 2025 at 01:18:32PM +0000, Bjoern A. Zeeb wrote:
> Just netbooted a main (0d843cc2e2a373f01f) GENERIC to do some testing on
> a board I rarely use before pushing changes and got the below.
> Has this been fixed already?
> 
> ...
> Last login: Sun May 25 19:37:45 on ttyu0
> VNASSERT failed: locked not true at /usr/src/bz_wifi_precommit_testing/sys/kern/vfs_subr.c:5795 (assert_vop_elocked)
> 0xfffff8000761b000: type VREG state VSTATE_CONSTRUCTED op 0xffffffff81aad768
>     usecount 2, writecount 0, refcount 3 seqc users 0
>     hold count flags ()
>     flags (VV_VMSIZEVNLOCK|VMP_LAZYLIST)
>     v_object 0xfffff80005d991f0 ref 0 pages 1 cleanbuf 0 dirtybuf 1
>     lock type nfs: SHARED (count 1)
> Aug 27 13:16:38         fapu2e4b login[19ileid 18620252 fsid 0x3a3a00ff01
> panic: fsync: vnode is not exclusive locked but should be
> cpuid = 3
> time = 1756300598
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0067d3a4c0
> vpanic() at vpanic+0x136/frame 0xfffffe0067d3a5f0
> panic() at panic+0x43/frame 0xfffffe0067d3a650
> vop_fsync_debugprepost() at vop_fsync_debugprepost+0x124/frame 0xfffffe0067d3a690
> VOP_FSYNC_APV() at VOP_FSYNC_APV+0x23/frame 0xfffffe0067d3a6b0
> bufsync() at bufsync+0x3b/frame 0xfffffe0067d3a6e0
> bufobj_invalbuf() at bufobj_invalbuf+0x24f/frame 0xfffffe0067d3a740
> ncl_vinvalbuf() at ncl_vinvalbuf+0x100/frame 0xfffffe0067d3a7b0
> nlm_advlock_internal() at nlm_advlock_internal+0xa7/frame 0xfffffe0067d3aaf0
> nlm_advlock() at nlm_advlock+0x2d/frame 0xfffffe0067d3ab10
> nfs_advlock() at nfs_advlock+0x1d0/frame 0xfffffe0067d3ac30
> vop_sigdefer() at vop_sigdefer+0x30/frame 0xfffffe0067d3ac60
> VOP_ADVLOCK_APV() at VOP_ADVLOCK_APV+0x51/frame 0xfffffe0067d3ac90
> vn_closefile() at vn_closefile+0x9a/frame 0xfffffe0067d3ad10
> _fdrop() at _fdrop+0x1a/frame 0xfffffe0067d3ad30
> closef() at closef+0x1e3/frame 0xfffffe0067d3adc0
> closefp_impl() at closefp_impl+0x71/frame 0xfffffe0067d3ae00
> amd64_syscall() at amd64_syscall+0x169/frame 0xfffffe0067d3af30
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0067d3af30
> --- syscall (6, FreeBSD ELF64, close), rip = 0x1ea0dba28cca, rsp = 0x1ea0d6def6b8, rbp = 0x1ea0d6def6e0 ---
> KDB: enter: panic
> [ thread pid 1955 tid 100182 ]
> Stopped at      kdb_enter+0x33: movq    $0,0x1222d22(%rip)
> db> show alllocks
> Process 1955 (login) thread 0xfffff80005dda000 (100182)
> exclusive lockmgr nfsupg (nfsupg) r = 0 (0xfffffe006b1557e0) locked @ /usr/src/bz_wifi_precommit_testing/sys/fs/nfsclient/nfs_clsubs.c:146
> shared lockmgr nfs (nfs) r = 0 (0xfffff8000761b070) locked @ /usr/src/bz_wifi_precommit_testing/sys/fs/nfsclient/nfs_clvnops.c:3477
> Process 1817 (syslogd) thread 0xfffff80005d85780 (100108)
> exclusive lockmgr nfs (nfs) r = 0 (0xfffff800447a33e0) locked @ /usr/src/bz_wifi_precommit_testing/sys/kern/vfs_vnops.c:1243
> 

You are using nfslockd, right?
Try this.

commit 881d724a671caa628407373faf0b87a70bfb3218
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Wed Aug 27 19:57:06 2025 +0300

    nfs client: switch nfs_advlock() to use exclusive vnode lock
    
    It eliminates the need to upgrade the lock in the function.
    More importantly, the calls to nfs_advlock_p()/nlm_advlock() sometimes
    flush buffers, which requires exclusive locking.
    
    Reported by:    bz

diff --git a/sys/fs/nfsclient/nfs_clvnops.c b/sys/fs/nfsclient/nfs_clvnops.c
index a8b06fdb261b..eee571a04821 100644
--- a/sys/fs/nfsclient/nfs_clvnops.c
+++ b/sys/fs/nfsclient/nfs_clvnops.c
@@ -3474,7 +3474,7 @@ nfs_advlock(struct vop_advlock_args *ap)
 	u_quad_t size;
 	struct nfsmount *nmp;
 
-	error = NFSVOPLOCK(vp, LK_SHARED);
+	error = NFSVOPLOCK(vp, LK_EXCLUSIVE);
 	if (error != 0)
 		return (EBADF);
 	nmp = VFSTONFS(vp->v_mount);
@@ -3511,11 +3511,6 @@ nfs_advlock(struct vop_advlock_args *ap)
 			cred = p->p_ucred;
 		else
 			cred = td->td_ucred;
-		NFSVOPLOCK(vp, LK_UPGRADE | LK_RETRY);
-		if (VN_IS_DOOMED(vp)) {
-			error = EBADF;
-			goto out;
-		}
 
 		/*
 		 * If this is unlocking a write locked region, flush and