cvs commit: src/share/man/man9 locking.9 rmlock.9 src/sys/conf
files src/sys/kern kern_rmlock.c subr_lock.c subr_pcpu.c
subr_smp.c src/sys/sys _rmlock.h lock.h pcpu.h rmlock.h smp.h
rwatson at FreeBSD.org
Mon Nov 26 02:31:43 PST 2007
On Mon, 26 Nov 2007, Daniel Eischen wrote:
>> Spin and blocking mutexes should in my opinion be defined as different
>> structures, at least in name so that the compiler hits you with a clue-bat
>> when you try use a spin-lock with non-spinlock ops etc.
> That seems nice, but doesn't go along well with trying to keep a similar API
> as Solaris. It is convenient to share the APIs so that it is easy to change
> the mtx_init() from a default to a spin type without changing the rest of
> the code. We really shouldn't have a need for spin mutexes if the default
> mutexes are adaptive, and if mutexes taken by interrupt handlers are
> initialized in a manner similar to Solaris. There really shouldn't be a
> separate API for spin mutexes. Once a mutex is initialized as MTX_DEF or
> MTX_SPIN, that should be sufficient.
While I agree from a code hacking perspective, one of the benefits of having
different types and APIs in the code is that you can have more rich static
checking without having to try and manage the state associated with flags at
init-time. For example, maintaining the MTX_SPIN/MTX_DEF state in Coverity
would be difficult or impossible, because global context isn't generally
available, but by using different types locally in analyzed code, you can have
more rigorous checking. The "classic" example is with rwlocks/mutexes and
sxlocks and the check for unbounded sleeping: it's ok to tsleep/msleep while
holding an sxlock, but not while holding a mutex or rwlock unless it's the
argument to the sleep call, and hence implicitly dropped. If we combined all
the types, we couldn't enforce that statically, but by using different types,
local static analysis can flag the sleep attempt. In much code in the kernel,
we accomplish lock "pluggability" by using macros rather than by sharing data
types, allowing changes at compile-time but not run-time, and this appears to
work well in practice.
>> not sure why sx-locks exist at all, as they seem to be a variant of sleep.
>> I think it's just a convenience function set to allow one to implement a
>> sleep-derived synchronisation.
> Hmm, sx locks seem similar to rmlocks, at least according to the
> Shared/exclusive locks are used to protect data that are read far
> more often than they are written.
> Do we need both?
rmlocks are most similar to rwlocks, as they have priority propogation and
permit only bounded sleeping. There's some argument that we might want to
realign the rwlock API to match the rmlock API to make them more easily
A really important but less well-understood element of our kernel locking
model is the careful distinction made between bounded and unbounded sleeps.
For the core "sleeping" synchronization primitives -- default mutexes,
rwlocks, and rmlocks, it's permitted only to acquire locks in this class and
not to perform potentially unbounded I/O waits while holding them. sx locks
and the increasingly decrepid lockmgr locks, as well as custom constructions
that may still persist involving msleep and condition variables, may be held
over unbounded I/O waiting, such as page faults, disk I/O, socket I/O, etc.
The former class of locks necessarily always falls after the latter class of
locks in the global lock order, and we don't permit acquisition of the
unbounded sleep locks or other unbounded sleep primitives in sensitive
contexts, such as in interrupt threads, software interrupt threads, etc.
This prevents a broad class of deadlocks from even being expressed -- hence
the recent issues with ipf and pf holding non-sleep able (mutex/rwlock/rmlock)
primitives over copyin/copyout, which is forbidden by witness because those
locks are intended to be shared between sensitive and less sensitive contexts.
Just to illustrate how this plays out in practice: each socket buffer has two
locks -- the socket buffer mutex and the socket buffer sx lock (historically a
mutex-interlocked msleep lock). The socket buffer mutex maintains the basic
consistency of the data structure, and may be acquired from ithreads/netisr
and the user thread accessing the socket buffer. That mutex is sufficient to
access and (in general although not in one specific case) modify the data
structure. The socket buffer sx lock, on the other hand, is about serializing
I/O operations to prevent undesired I/O interlacing during simultaneous
requests from user space, and is acquired only from user threads. Because a
single I/O may span many user data pages, and we don't want interlacing of
those I/Os, we acquire it at the beginning of a particular send() call, and
drop it at the end only once the complete operation has finished, so that you
don't get interlacing from other threads sending on the same socket when the
first threads priority drops, it faults on a page, or the like. This is not
disimilar to the pair of locks present on a vnode -- the vnode interlock mutex
(low level structure consistency) and the vnode lock (a lockmgr lock
protecting consistency over I/O).
Robert N M Watson
University of Cambridge
More information about the cvs-src