Re: Various unprotected accesses to buf and vnode

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Tue, 31 Aug 2021 14:44:42 UTC
On Tue, Aug 31, 2021 at 10:59:26AM +0200, Alexander Lochmann wrote:
> On 29.08.21 00:29, Konstantin Belousov wrote:
> > Ok, I see some call sequences (?), but again all of them are ffs_write()
> > (one is ext2_write) calling into cluster_write().  There the buffer lock
> > is owned.
> > 
> > Show me the specific call sequence where it is not.
> Who owns the buffer lock at that point? Has its ownership been
> transferred to the kernel?
> Do you know where the buffer lock is acquired?
> 
> According to our data, the buffer lock of the current accessed buffer is
> not owned. Otherwise, there would an entry like this
> 'EMBSAME(buf.b_lock[w])'.
> So in all of those call sequences the buffer lock is not acquired.
> However, I'd not rule out that our tooling could be broken as well.
Buffer is locked inside UFS_BALLOC(), which returns it to the ffs_write()
use.

> 
> > Ah, yes, the calls from lookup and open would be with the shared lock.
> > Still, we lock the vnode interlock to avoid double-allocating the v_object
> > object, so it is fine.  Some mode of the vnode lock is required nonetheless,
> > because otherwise we might miss reclaim which guarantees that v_object 
> > is freed.
> > 
> I see. Does this rule apply to all fields for which the vnode lock is
> the designated lock?
Which rule?

We just need to provide some mutual exclusion rules, while satisfying the
requirements of the lock hierarchy ordering to avoid deadlock.  If we
can guarantee that specific field is always modified while owning the
vnode lock exclusively, fine.  If we only own the vnode lock shared,
typically vnode interlock provides the exclusivity for access.

Read e.g. sys/ufs/ufs/inode.h gerald comment above struct inode definition.
It provides more detailed exposure.

> From a different angle:
> The documentation says about bo_object: ''v' is the vnode lock which
> embeds the bufobj.'.
> Does 'the vnode lock' mean a specific lock, or a group of locks?

Vnode lock is a lock obtained with vn_lock().  It is up to filesystem
to implement VOP_LOCK() which locks the vnode.

Default VOP_LOCK() locks vp->v_vnlock, which again by default points
to &vp->v_lock.

There are several special cases.  For instance, for FFS and snapshot vnodes,
v_vnlock points to the snapdata->sn_lock (snapdata is unique per FFS mount).
For nullfs non-reclaimed vnodes, v_vnlock points to the lower vnode lock.