Does msodsfs_readdir() require a exclusively locked vnode

Wed Jul 27 17:56:10 UTC 2011

John Baldwin wrote:
> On Tuesday, July 26, 2011 9:56:37 pm Attilio Rao wrote:
> > 2011/7/26 Kostik Belousov <kostikbel at gmail.com>:
> > > On Tue, Jul 26, 2011 at 10:07:28AM -0400, Rick Macklem wrote:
> > >> Kostik Belousov wrote:
> > >> > On Mon, Jul 25, 2011 at 07:22:40PM -0400, Rick Macklem wrote:
> > >> > > Hi,
> > >> > >
> > >> > > Currently both NFS servers set the vnode lock LK_SHARED
> > >> > > and so do the local syscalls (at least that's how it looks
> > >> > > by inspection?).
> > >> > >
> > >> > > Peter Holm just posted me this panic, where a test for an
> > >> > > exclusive vnode lock fails in msdosfs_readdir().
> > >> > > KDB: stack backtrace:
> > >> > >
> db_trace_self_wrapper(c0efa6f6,c71627f8,c79230b0,c0f2ef29,f19154b8,...)
> > >> > > at db_trace_self_wrapper+0x26
> > >> > > kdb_backtrace(c7f20b38,f19154fc,c0d586d5,f191550c,c7f20ae0,...)
> > >> > > at
> > >> > > kdb_backtrace+0x2a
> > >> > > vfs_badlock(c101b180,f191550c,c1055580,c7f20ae0) at
> > >> > > vfs_badlock+0x23
> > >> > > assert_vop_elocked(c7f20ae0,c0ee5f4f,c09f3213,8,0,...) at
> > >> > > assert_vop_elocked+0x55
> > >> > > pcbmap(c7966e00,0,f191560c,f1915618,f191561c,...) at
> > >> > > pcbmap+0x45
> > >> > > msdosfs_readdir(f1915960,c0f4b343,c7f20ae0,f1915940,0,...) at
> > >> > > msdosfs_readdir+0x528
> > >> > > VOP_READDIR_APV(c101b180,f1915960,2,f1915a68,c7923000,...) at
> > >> > > VOP_READDIR_APV+0xc5
> > >> > > nfsrvd_readdir(f1915b64,0,c7f20ae0,c7923000,f1915a68,...) at
> > >> > > nfsrvd_readdir+0x38e
> > >> > > nfsrvd_dorpc(f1915b64,0,c7923000,c842a200,4,...) at
> > >> > > nfsrvd_dorpc+0x1f79
> > >> > > nfssvc_program(c7793800,c842a200,c0f24d67,492,0,...) at
> > >> > > nfssvc_program+0x40f
> > >> > > svc_run_internal(f1915d14,c09d9a98,c73dfa80,f1915d28,c0ef1130,...)
> > >> > > at svc_run_internal+0x952
> > >> > > svc_thread_start(c73dfa80,f1915d28,c0ef1130,3a5,c7e4b2c0,...)
> > >> > > at
> > >> > > svc_thread_start+0x10
> > >> > > fork_exit(c0bed7d0,c73dfa80,f1915d28) at fork_exit+0xb8
> > >> > > fork_trampoline() at fork_trampoline+0x8
> > >> > > --- trap 0x804c12e, eip = 0xc, esp = 0x33, ebp = 0x1 ---
> > >> > > pcbmap: 0xc7f20ae0 is not exclusive locked but should be
> > >> > > KDB: enter: lock violation
> > >> > >
> > >> > > So, does anyone know if the msdosfs_readdir() really requires
> > >> > > a
> > >> > > LK_EXCLUSIVE
> > >> > > locked vnode or is the ASSERT_VOP_ELOCKED() too strong in
> > >> > > pcbmap()?
> > >> >
> > >> > Yes, msdosfs currently requires all vnode locks to be
> > >> > exclusive. One
> > >> > of
> > >> > the reasons is that each denode (the msdosfs-private vnode
> > >> > data)
> > >> > carries
> > >> > the fat entries cache, and this cache is updated even by the
> > >> > operations
> > >> > that do not modify vnode from the VFS POV.
> > >> >
> > >> > The locking regime is enforced by the getnewvnode()
> > >> > initializing the
> > >> > vnode
> > >> > lock with LK_NOSHARE flag, and msdosfs code not calling
> > >> > VN_LOCK_ASHARE()
> > >> > on the newly instantiated vnode.
> > >> >
> > >> > My question is, was the vnode in question locked at all ?
> > >> I think the problem is that I do a LK_DOWNGRADE. From a quick
> > >> look at __lockmgr_args(), it doesn't check LK_NOSHARE for a
> > >> LK_DOWNGRADE.
> > >>
> > >> Maybe __lockmgr_args() should have something like:
> > >>    if (op == LK_DOWNGRADE && (lk->lock_object.lo_flags &
> > >>    LK_NOSHARE))
> > >>         return (0); /* noop */
> > >> after the
> > >>    if (op == LK_SHARED && (lk->lock_object.lo_flags &
> > >>    LK_NOSHARE))
> > >>         op = LK_EXCLUSIVE;
> > >> lines?
> > > The RELENG_7 lockmgr does not check the NOSHARE flag on downgrade,
> > > but I agree with the essence of your proposal.
> >
> > As long as the difference in semantic with the old lockmgr is
> > correctly stressed out in the doc (and eventually comments) I'm fine
> > with this change.
> 
> I think it is a bug in the LK_NOSHARE implementation if the old
> lockmgr()
> didn't silently nop downgrade requests when LK_NOSHARE was set. :) We
> should definitely fix it to ignore downgrades for LK_NOSHARE.
> 
By the way, I think that __lockmgr_args() in -current doesn't check for
LK_NOSHARE. That was what pho@ was testing when he found the problem.

At this point, I believe that the new NFS server (which I have a patch for
that pho@ is testing to avoid LK_DOWNGRADE) is the only place that is
broken. (compute_cn_lkflags() only sets LK_SHARED if MNT_LOOKUP_SHARED
is set and the only LK_DOWNGRADE I see is in vfs_cache.c when
cn_lkflags == LK_SHARED. The rest are in file systems that handle LK_SHARED
locked vnodes, from what I can see at a glance.)

So, it isn't a difference between old/current behaviour, just a suggestion
that adding a check in __lockmgr_args() might be a nice safety belt for
the future, since __lockargs_mgr() already checks for the LK_SHARED case.

rick, who will get the fix for the new NFS server to re@ soon