[RFC] Understanding the locking of struct buf

Sat Mar 13 12:59:26 UTC 2021

On Sat, Mar 13, 2021 at 01:03:59PM +0100, Alexander Lochmann wrote:
> 
> 
> On 13.03.21 04:30, Konstantin Belousov wrote:
> >> E.g. any read is permitted without a lock being held.
> >> Can b_bcount, for example, be read without a lock?
> > Sure, you can read it without a lock.  The question is what you intend
> > to do with this information.
> We're performing lock analysis on the FreeBSD kernel, and I want to
> understand what kind of general assumptions are made.
> In the Linux kernel, for example, every word-sized value is considered
> safe to read without a lock if consistency doesn't matter.
What consistency?  If you are talking about the multithreading memory
model expected by the FreeBSD kernel, look at atomic(9).  It spells out
its assumptions explicitly, such as the atomicity of naturally-aligned
word-sized integer accesses.
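
To make the word-sized case concrete, here is a minimal userspace sketch.
It uses C11 <stdatomic.h> as a stand-in for the kernel's atomic(9)
primitives, and struct toy_buf, toy_set_bcount() and toy_peek_bcount()
are hypothetical names, not kernel code.  A relaxed load of an aligned
word never returns a torn value, but it says nothing about whether the
value is still current once you act on it.

```c
#include <assert.h>
#include <stdatomic.h>

/* The sketch only makes sense where long accesses are always lock-free. */
static_assert(ATOMIC_LONG_LOCK_FREE == 2, "long accesses must be lock-free");

/* Hypothetical stand-in for a structure with a word-sized counter field. */
struct toy_buf {
	atomic_long b_bcount;
};

/* Writer side: a single, untorn store of the whole word. */
static inline void
toy_set_bcount(struct toy_buf *bp, long n)
{
	atomic_store_explicit(&bp->b_bcount, n, memory_order_relaxed);
}

/*
 * Reader side: an unlocked peek.  The result is some value that was
 * stored at one point, but it may already be stale when it is used.
 */
static inline long
toy_peek_bcount(struct toy_buf *bp)
{
	return (atomic_load_explicit(&bp->b_bcount, memory_order_relaxed));
}
```

Such a peek is fine for statistics or heuristics.  Anything that must
stay consistent with other fields of the same structure still needs the
lock.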

> In FreeBSD, do I have to use lock X in any case except Y and Z?
> Or is it the other way round: Do I need no lock at all except for case X
> and Y?
I do not understand this question(?).

> 
> > Are you reporting a bug or just asking about LK_KERNPROC?  Lockmgr
> > (kern_lock.c) is careful enough to handle the LK_KERNPROC owner
> > specially.  In particular, it does not report the unlock of such a
> > lock to witness.
> First of all, I want to understand how things in FreeBSD work.
> From what I understand now: When setting up an asynchronous IO request,
> buf.b_lock is taken by thread X. Later on LK_KERNPROC is used to hand
> over the ownership to the kernel. The lock itself is still acquired.
The lock is acquired in the sense that nobody else can take the buffer's
lock until the call to lockmgr(LK_RELEASE).  But from this point on, no
thread owns the lock.  Think of the lock as having been converted into a
semaphore with a count of 1.
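
A toy model of that handover, as a sketch only: this is not lockmgr code,
all toy_* names and the integer thread ids are made up for illustration,
and the model is single-threaded (a real lock would update the owner
atomically).  It shows just the ownership rule: after the disown step the
lock stays held, no thread owns it, and any thread may release it.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NOOWNER	0	/* lock is free */
#define TOY_KERNPROC	(-1)	/* lock is held, but owned by no thread */

/* Hypothetical lock; thread ids are positive integers. */
struct toy_lock {
	int owner;
};

/* Succeeds only if the lock is free. */
static inline bool
toy_acquire(struct toy_lock *lk, int tid)
{
	if (lk->owner != TOY_NOOWNER)
		return (false);
	lk->owner = tid;
	return (true);
}

/* The owner gives up ownership; the lock itself stays held. */
static inline void
toy_disown(struct toy_lock *lk, int tid)
{
	assert(lk->owner == tid);
	lk->owner = TOY_KERNPROC;
}

/* The owner may release; a disowned lock may be released by anyone. */
static inline bool
toy_release(struct toy_lock *lk, int tid)
{
	if (lk->owner != tid && lk->owner != TOY_KERNPROC)
		return (false);
	lk->owner = TOY_NOOWNER;
	return (true);
}
```

In this model thread 1 acquires and disowns, thread 2 still cannot
acquire, and thread 2 (standing in for the completion thread) is the one
that releases.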

> The completion of the IO job is performed in the kernel's context, which
> releases buf.b_lock in brelse().
The completion for async IO is performed in the context of some other
thread, typically either the geom io up thread or the direct completion
thread of some disk driver.  This is why the somewhat strange LK_KERNPROC
business is needed at all.

For sync IO, that thread only signals the original thread that the io has
completed.  In this case, no LK_KERNPROC trick is performed.

> So there is no explicit lock call in the latter context, is there?
No lock call, but there is an unlock.

> 
> However, I think there is indeed a call to WITNESS_UNLOCK() in
> __lockmgr_disown():
> https://github.com/freebsd/freebsd-src/blob/main/sys/kern/kern_lock.c#L1641
> 
> For the following stack trace, we observe several writes to buf.b_bcount
> without any lock held.
> xpt_done_td
> xpt_done_process
> adadone
> g_disk_done
> g_io_deliver
> g_std_done
> g_io_deliver
> g_std_done
> g_io_deliver
> g_vfs_done
> bufdone
> ffs_backgroundwritedone
> bufdone
> brelse
Assuming you wrote the stack bottom-up, this is exactly what I described
above: xpt_done_td is the CAM IO completion thread, and it performs its
actions after the hw has reported that the io request (bio) completed.

> allocbuf
When brelse() notices that the buffer was marked as 'no cache', it
demolishes the buffer right after the async io finishes.  Perhaps this is
the case that you observed.

> 
> There are several reasons for those observations:
> a) Due to the call to WITNESS_UNLOCK() in __lockmgr_disown(), which we
> instrumented, our approach assumes no lock is held.
> b) Our instrumentation missed a lock call.
> c) Down the path described above, no locks are held at all.
> d) Something else....
> Can you please shed some light on our observation?
> 
> - Alex
> 
> -- 
> Technische Universität Dortmund
> Alexander Lochmann                PGP key: 0xBC3EF6FD
> Otto-Hahn-Str. 16                 phone:  +49.231.7556141
> D-44227 Dortmund                  fax:    +49.231.7556116
> http://ess.cs.tu-dortmund.de/Staff/al