msleep() on recursively locked mutexes

Wed May 2 08:35:29 UTC 2007

Matthew Dillon wrote:
>     The real culprit here is passing held mutexes to unrelated procedures
>     in the first place because those procedures might have to block, in
>     order so those procedures can release and reacquire the mutex.
>     That's just bad coding in my view.  The unrelated procedure has no
>     clue as to what the mutex is or why it is being held and really has no
>     business messing with it.
> 
>     What I did was implement spinlocks with VERY restricted capabilities,
>     far more restricted than the capabilities of your mutexes.  Our
>     spinlocks are meant only to be used to lock up tiny pieces of code
>     (like for ref counting or structural or flag-changing operations).
>     Plus the kernel automatically acts as if it were in a critical section
>     if it takes an interrupt while the current thread is holding a spinlock.
>     That way mainline code can just use a spinlock to deal with small bits
>     of interlocked information without it costing much in the way of
>     overhead.

Well, this is currently what our spin mutexes do too.
The mtx_lock_spin()/mtx_unlock_spin() pair simply enters/exits a
critical section (disabling interrupts in the meantime and avoiding
preemption). They too are intended for very small pieces of code.
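
To make the "very small pieces of code" point concrete, here is a rough
userland sketch of a spin lock guarding a reference count. It is only an
analogy built on C11 atomics: struct spin_mtx, struct refobj and every
function name below are hypothetical, not the kernel's struct mtx or
mtx_lock_spin(), and nothing here disables interrupts or preemption the
way the kernel primitives do.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userland stand-in for a spin mutex (not struct mtx). */
struct spin_mtx {
    atomic_flag locked;
};

/* Hypothetical object whose reference count the spin lock protects. */
struct refobj {
    struct spin_mtx lock;
    int refs;
};

static void spin_mtx_lock(struct spin_mtx *m)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&m->locked,
        memory_order_acquire))
        ;
}

static void spin_mtx_unlock(struct spin_mtx *m)
{
    atomic_flag_clear_explicit(&m->locked, memory_order_release);
}

static void refobj_init(struct refobj *o)
{
    atomic_flag_clear(&o->lock.locked);
    o->refs = 1;
}

/* Take a reference; the critical section is a single increment. */
static int refobj_hold(struct refobj *o)
{
    int n;

    spin_mtx_lock(&o->lock);
    n = ++o->refs;
    spin_mtx_unlock(&o->lock);
    return n;
}

/* Drop a reference; returns the remaining count. Nothing here can block. */
static int refobj_drop(struct refobj *o)
{
    int n;

    spin_mtx_lock(&o->lock);
    assert(o->refs > 0);
    n = --o->refs;
    spin_mtx_unlock(&o->lock);
    return n;
}
```

The point of the sketch is the shape of the critical sections: a few
instructions each, with no call that could sleep while the lock is held.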

>     I made the decision that ANYTHING more complex than that would have to
>     use a real lock, like a lockmgr lock or a token, depending on the
>     characteristics desired.  To make it even more desirable I also
>     stripped down the lockmgr() lock implementation, removing numerous
>     bits that were inherited from very old code methodologies that have no
>     business being in a modern operating system, like LK_DRAIN.  And I
>     removed the passing of an interlocking spinlock to the lockmgr code,
>     because that methodology was being massively abused in existing code
>     (and I do mean massively).

Well, if you add a smarter interface, you have *exactly* our sx lock
implementation.
Basically, sx and lockmgr in FreeBSD differ only because of lockmgr's
stupid API, because of draining and because of the interlock.
Otherwise they are very, very similar*.

>     I'm not quite sure what the best way to go is for FreeBSD, because
>     you guys have made your mutexes just as or even more sophisticated
>     than your normal locks in many respects, and you have like 50 different
>     types of locks now (I can't keep track of them all).

Not sure what you mean by 'more sophisticated'... anyway...
The only problem I currently see with our locking primitives is that
they are not very well documented (or part of the documentation is
stale). That can be a problem when there are as many locking
primitives as we have, but it doesn't mean that they are complex.
Really, each primitive is very simple and is designed to be used in its
particular context. The restrictions we place on locks are just a sort
of warning for people developing wrong locking strategies.
For example, there are no technological difficulties in allowing
mutexes to be held while sleeping, but if this really happens, there is
probably a problem in your locking scheme.
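
As an illustration of a restriction acting as a warning rather than a
technical limit, here is a toy, single-threaded model of a "no sleeping
with a mutex held" check. It is not how the FreeBSD kernel implements
its checks: struct thread_state and the toy_* functions are names made
up for this sketch, and the "locks" are only counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-thread bookkeeping, invented for this sketch. */
struct thread_state {
    int mutexes_held;   /* non-sleepable mutexes currently held */
    int violations;     /* sleep attempts made while holding one */
};

static void toy_mtx_lock(struct thread_state *td)
{
    td->mutexes_held++;
}

static void toy_mtx_unlock(struct thread_state *td)
{
    assert(td->mutexes_held > 0);
    td->mutexes_held--;
}

/*
 * Called before a potentially blocking operation. Sleeping would still
 * be possible mechanically; the check only flags the suspect locking
 * strategy. Returns true if no mutex is held.
 */
static bool toy_may_sleep(struct thread_state *td, const char *where)
{
    if (td->mutexes_held > 0) {
        td->violations++;
        fprintf(stderr,
            "warning: sleeping at %s with %d mutex(es) held\n",
            where, td->mutexes_held);
        return false;
    }
    return true;
}
```

A caller that sees the warning has two ways out: release the mutex
before the blocking call, or switch to a lock type that is allowed to
be held across a sleep.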

>     If I were to offer advice it would be: Just stop trying to mix water
>     and hot wax.  Stop holding mutexes across potentially blocking procedure
>     calls.  Stop passing mutexes into unrelated bits of code in order for
>     them to be released and reacquired somewhere deep in that code.  Just
>     doing that will probably solve all of the problems being reported.

I cannot understand which part of the code you are referring to here...

Thanks,
Attilio

* Another difference concerns upgrading, but I consider FreeBSD's
lockmgr upgrading a really bad design choice, and the world would be a
much better place without it.