Re: git: 8ef0c11e7ce7 - main - nfsclient: upgrade vnode lock in VOP_OPEN()/VOP_CLOSE() if we need to flush buffers

From: Peter Jeremy <peter_at_rulingia.com>
Date: Wed, 24 Nov 2021 02:19:18 UTC
On 2021-Nov-23 11:19:21 +0200, Konstantin Belousov <kostikbel@gmail.com> wrote:
>On Tue, Nov 23, 2021 at 07:09:08PM +1100, Peter Jeremy wrote:
>> On 2021-Nov-16 17:14:55 +0000, Konstantin Belousov <kib@FreeBSD.org> wrote:
>> >    nfsclient: upgrade vnode lock in VOP_OPEN()/VOP_CLOSE() if we need to flush buffers
>> >    
>> >    VOP_FSYNC() asserts that the vnode is exclusively locked for NFS.
>> >    If we try to execute file with recently modified content, the assert is
>> >    triggered.
>> 
>> I have a diskless arm64 system configured with swap over NFS and I'm
>> now consistently getting a panic during shutdown.  I haven't
>> specificially confirmed that it's this commit but the content is
>> suggestive.
>> 
>> panic: upgrade of unlocked lock (lockmgr) nfs @ /usr/src/sys/fs/nfsclient/nfs_clvnops.c:855
>> cpuid = 3
>> time = 1637166551
>> KDB: stack backtrace:
>> db_trace_self() at db_trace_self
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>> vpanic() at vpanic+0x178
>> panic() at panic+0x44
>> witness_upgrade() at witness_upgrade+0x104
>> lockmgr_upgrade() at lockmgr_upgrade+0x164
>> nfs_lock() at nfs_lock+0x2c
>> vop_sigdefer() at vop_sigdefer+0x30
>> _vn_lock() at _vn_lock+0x54
>> nfs_close() at nfs_close+0xc8
>> vop_sigdefer() at vop_sigdefer+0x30
>> VOP_CLOSE_APV() at VOP_CLOSE_APV+0x2c
>> swapdev_close() at swapdev_close+0x3c
>> swapoff_one() at swapoff_one+0x598
>> sys_swapoff() at sys_swapoff+0x12c
>> do_el0_sync() at do_el0_sync+0x498
>> handle_el0_sync() at handle_el0_sync+0x90
>> --- exception, esr 0x56000000
>> 
>> I presume this isn't intended.  Can you suggest where I should start
>> looking for the problem?
>
>Try this please.  It might be also useful to enable DEBUG_VFS_LOCKS in your
>kernel config, to catch all related issues once.

Thanks for the rapid response.  I've found that DEBUG_VFS_LOCKS also
requires INVARIANTS.  That now catches swapon as well:
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
assert_vop_locked() at assert_vop_locked+0x58
VOP_GETATTR_APV() at VOP_GETATTR_APV+0x4c
sys_swapon() at sys_swapon+0x2c0
do_el0_sync() at do_el0_sync+0x4a4
handle_el0_sync() at handle_el0_sync+0x90
--- exception, esr 0x56000000
vnode 0xffffa000065511c0: type VREG
    usecount 1, writecount 0, refcount 1 seqc users 0
    hold count flags ()
    flags (VV_VMSIZEVNLOCK)
    lock type nfs: UNLOCKED
        fileid 30984 fsid 0x3a3a00ff01
VOP_GETATTR: 0xffffa000065511c0 is not locked but should be
KDB: enter: lock violation
[ thread pid 1027 tid 100159 ]
Stopped at      kdb_enter+0x48: undefined       f900c11f

I will dig into that further this evening.

-- 
Peter Jeremy