msleep() on recursively locked mutexes

Julian Elischer julian at elischer.org
Fri Apr 27 17:41:20 UTC 2007


Hans Petter Selasky wrote:
> On Thursday 26 April 2007 23:50, Attilio Rao wrote:
>> 2007/4/26, Julian Elischer <julian at elischer.org>:
>>> The reason that mutexes ever recurse in the first place is usually
>>> because one piece of code calls itself (or a related piece of code) in a
>>> blind manner.. in other words, it doesn't know it is doing so.  The whole
>>> concept of recursing mutexes is a bit gross. The concept of blindly
>>> unwinding them is I think just asking for trouble.
>>>
>>> Further the idea that holding a mutex "except for when we sleep" is a
>>> generally bright idea is also a bit odd to me.. If you hold a mutex and
>>> release it during sleep you probably should invalidate all assumptions
>>> you made during the period before you slept as whatever you were
> 
> That is not always correct. If you run your code in a separate 
> thread/taskqueue, then you simply wait for this thread/taskqueue to complete 
> somewhere else. This is basically when I need to exit mutexes.

Then you are probably using the wrong synchronization primitive.

> 
>>> protecting has possibly been trashed while you slept. I have seen too many
>>> instances where people just called msleep and dropped the mutex they
>>> held, picked it up again on wakeup, and then blithely continued on
>>> without checking what happened while they were asleep.
>> Basically, the idea that you cannot hold "blocking" locks (mutexes and
>> rwlocks) while sleeping comes from the difference between
>> turnstiles and sleepqueues.
>>
>> Turnstiles are meant to serve synchronous events, for a short period of
>> time (or rather short), while sleepqueues are meant to serve
>> asynchronous events, so that the path to be protected can be
>> definitely bigger. If you find yourself in a situation where you have to
>> take first a blocking lock and later a sleeping lock, you are probably
>> just using the wrong locking strategy and you should really revisit it.

that's what I think..
if you are trying to make a nail with threading, then you 
probably should be using a screw to start with.

You obviously know a lot about USB.
You have a good grasp of how that works.
Maybe what you need to do is write a small document on how the USB side 
should work, and what needs to be locked, and enter into a cooperative 
partnership with some other people who are active in the locking side of 
things to produce an optimal locking strategy for your module.

> 
> The suggestion is just for convenience. Usually you don't have a recursed 
> mutex to sleep on. It is just to catch some rare cases where you will end up 
> with a doubly locked mutex, which is not part of the ordinary code path. I 
> don't have such cases in the kernel of the new USB stack, but there are some 
> cases in the USB device drivers, which is due to some mutex locking moves. 
> Those I can fix.

Basically you shouldn't have a recursed mutex FULL STOP. We have a couple 
of instances in the kernel where we allow a mutex to recurse, but they had to be 
hard fought, and the general rule is "Don't". If you are recursing on 
a mutex you need to switch to some other method of doing things.
e.g. reference counts, turnstiles, whatever. Use the mutex to create these,
but don't hold the mutex for long enough to need to recurse on it. A mutex should
generally be: lock, dash in and work, unlock. We have some cases where that is 
not true, but we are trying to get rid of them, not add more.
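
To make the "use the mutex to create a reference count" idea concrete,
here is a minimal sketch. It is userland C with POSIX threads, not kernel
code using mtx(9), and struct obj, obj_hold(), obj_rele() and obj_work()
are names made up for the illustration. The only point is that the mutex
is held just long enough to touch the counter or the data, so a nested
call never finds it already locked.

```c
/*
 * Userland sketch with POSIX threads; NOT FreeBSD kernel code.
 * struct obj and the obj_*() functions are hypothetical names.
 */
#include <assert.h>
#include <pthread.h>

struct obj {
	pthread_mutex_t	lock;	/* protects refs and value only */
	int		refs;	/* callers currently using the object */
	int		value;
};

/* Lock, dash in, take a reference, unlock. */
static void
obj_hold(struct obj *o)
{
	pthread_mutex_lock(&o->lock);
	o->refs++;
	pthread_mutex_unlock(&o->lock);
}

/* Drop a reference; returns the number of references left. */
static int
obj_rele(struct obj *o)
{
	int left;

	pthread_mutex_lock(&o->lock);
	assert(o->refs > 0);
	left = --o->refs;
	pthread_mutex_unlock(&o->lock);
	return (left);
}

/*
 * The work runs with a reference held but the mutex NOT held, so the
 * nested call below never has to recurse on the (non-recursive) mutex.
 */
static int
obj_work(struct obj *o, int depth)
{
	int v;

	obj_hold(o);
	if (depth > 0)
		(void)obj_work(o, depth - 1);	/* nested call, lock not held */
	pthread_mutex_lock(&o->lock);
	v = ++o->value;				/* short critical section */
	pthread_mutex_unlock(&o->lock);
	(void)obj_rele(o);
	return (v);
}
```

A caller would initialise the object with PTHREAD_MUTEX_INITIALIZER and
zeroed counters and then call obj_work(); the reference, not the mutex, is
what keeps the object alive across the nested calls.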

Also look at read-write type locks and use them as much as possible;
(man 9 locking) should be a guide to this (if people were to add to it,
it would be even better).



> 
> My idea was that by allowing recursive mutexes to sleep, you will end up with 
> less panics in the end for the unwary code developer. You just protect your 
> code with mutexes and if they recurse calling synchronous USB functions, you 
> don't have to care.

I think trying to sleep with a recursed mutex should be an instant panic,
even if the mutex IS marked as being allowed to sleep.
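
Roughly what I mean, as a sketch only: this is userland C with POSIX
threads standing in for msleep() and a recursable mutex, and struct chan
and the chan_*() functions are hypothetical names, not any existing kernel
interface. The sleep wrapper asserts that the lock is held exactly once
(the "instant panic"), and the caller re-checks its condition after every
wakeup instead of assuming nothing changed while it slept.

```c
/*
 * Userland sketch with POSIX threads; NOT FreeBSD kernel code.
 * struct chan and the chan_*() functions are hypothetical names.
 */
#define _POSIX_C_SOURCE 200809L	/* for PTHREAD_MUTEX_RECURSIVE */
#include <assert.h>
#include <pthread.h>

struct chan {
	pthread_mutex_t	lock;	/* recursive, to mimic a recursable mutex */
	pthread_cond_t	cv;
	int		depth;	/* recursion depth, protected by lock */
	int		ready;	/* the condition being slept on */
};

static void
chan_init(struct chan *c)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
	pthread_mutex_init(&c->lock, &attr);
	pthread_mutexattr_destroy(&attr);
	pthread_cond_init(&c->cv, NULL);
	c->depth = 0;
	c->ready = 0;
}

static void
chan_lock(struct chan *c)
{
	pthread_mutex_lock(&c->lock);
	c->depth++;
}

static void
chan_unlock(struct chan *c)
{
	c->depth--;
	pthread_mutex_unlock(&c->lock);
}

/* Sleep until woken. Refuses outright to sleep on a recursed lock. */
static void
chan_sleep(struct chan *c)
{
	assert(c->depth == 1);		/* the "instant panic" */
	c->depth = 0;			/* lock is dropped while asleep */
	pthread_cond_wait(&c->cv, &c->lock);
	c->depth = 1;			/* and held again on wakeup */
}

/* Waker thread: change the state under the lock, then wake the sleeper. */
static void *
chan_waker(void *arg)
{
	struct chan *c = arg;

	chan_lock(c);
	c->ready = 1;
	pthread_cond_signal(&c->cv);
	chan_unlock(c);
	return (NULL);
}

/* Returns the value of ready that the sleeper sees once it is awake. */
static int
chan_demo(void)
{
	struct chan c;
	pthread_t td;
	int seen;

	chan_init(&c);
	chan_lock(&c);
	pthread_create(&td, NULL, chan_waker, &c);
	while (c.ready == 0)		/* re-check after every wakeup */
		chan_sleep(&c);
	seen = c.ready;
	chan_unlock(&c);
	pthread_join(td, NULL);
	return (seen);
}
```

If the caller had taken chan_lock() twice before calling chan_sleep(), the
assertion would fire immediately rather than letting the code unwind the
lock blindly and carry on.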

> 
>> As you mention, it is not always possible to drop the blocking lock
>> before sleeping, since you can break your critical path and open the
>> way for races of various kinds. Even unlocking Giant, which is
>> auto-magically done by the sleeping primitives, can lead to races that
>> are very difficult to discover (I can recall one in the tty code, some
>> months old, that would be a good proof-of-concept for that).
>>
> 
> --HPS


