msleep() on recursivly locked mutexes

Fri Apr 27 18:10:33 UTC 2007

On Friday 27 April 2007 19:39, Matthew Dillon wrote:
>     The real culprit here is passing held mutexes to unrelated procedures
>     in the first place because those procedures might have to block, in
>     order so those procedures can release and reacquire the mutex.
>     That's just bad coding in my view.  The unrelated procedure has no
>     clue as to what the mutex is or why it is being held and really has no
>     business messing with it.
>
>     What I did was implement spinlocks with VERY restricted capabilities,
>     far more restricted then the capabilities of your mutexes.  Our
>     spinlocks are meant only to be used to lock up tiny pieces of code
>     (like for ref counting or structural or flag-changing operations).
>     Plus the kernel automatically acts as if it were in a critical section
>     if it takes an interrupt while the current thread is holding a
> spinlock. That way mainline code can just use a spinlock to deal with small
> bits of interlocked information without it costing much in the way of
> overhead.
>
>     I made the decision that ANYTHING more complex then that would have to
>     use a real lock, like a lockmgr lock or a token, depending on the
>     characteristics desired.  To make it even more desireable I also
>     stripped down the lockmgr() lock implementation, removing numerous
>     bits that were inherited from very old code methodologies that have no
>     business being in a modern operating system, like LK_DRAIN.  And I
>     removed the passing of an interlocking spinlock to the lockmgr code,
>     because that methodology was being massively abused in existing code
>     (and I do mean massively).
>
>     I'm not quite sure what the best way to go is for FreeBSD, because
>     you guys have made your mutexes just as or even more sophisticated
>     then your normal locks in many respects, and you have like 50 different
>     types of locks now (I can't keep track of them all).
>
>     If I were to offer advise it would be: Just stop trying to mix water
>     and hot wax.  Stop holding mutexes across potentially blocking
> procedure calls.  Stop passing mutexes into unrelated bits of code in order
> for them to be released and reacquired somewhere deep in that code.  Just
> doing that will probably solve all of the problems being reported.

Maybe one should put some characters into blocking and non-blocking function 
names so that the compiler can check if non-blocking functions call blocking 
functions.

Doing everything blocking will make the system work, but suddently it can 
dead-lock, when two threads are waiting for eachother, and you will have zero 
clue about it.

I like the current FreeBSD. It's a brain exercise and it clearly shows that 
MPSAFE/Giant-free is quality mark, and not something you acchieve by adding 
some lock/unlock statements here and there. Sometimes it means you have to 
completely rewrite code. And that is the most difficult part to convince 
other developers about. Been there done that :-)

--HPS