panic: mutex Giant not owned at /usr/src/sys/kern/tty_ttydisc.c:1127

Sat Jan 24 15:29:53 PST 2009

Maksim Yevmenkin wrote:
> On Sat, Jan 24, 2009 at 1:55 AM, Hans Petter Selasky <hselasky at c2i.net> wrote:
>> On Saturday 24 January 2009, Maksim Yevmenkin wrote:
>>> Hans Petter,
>>>
>>> i'm sorry, did i mention there is no sleeping in netgraph? :-)
>> Can you elaborate this? Is netgraph run from fast interrupt context? So that
>> only spin locks are possible? Or is it run from a thread?
> 
> i already told you that netgraph is essentially single-threaded. for
> all intents and purposes think that only _ONE_ thread services _ALL_
> netgraph nodes. if something can not be done immediately,  netgraph
> queues it and returns back to it later. _ANY_ delay/sleep would stall
> _ENTIRE_ netgraph. think poll/select/callback driven programming
> model.

it depends on the delay.
if it's going to be 200usecs.. ok (with a grumble)
if it's going to be 5mSec it is NOT ok..

netgraph itself does nothing that would block
(except one mutex that is guaranteed to be really quick
and is probably always satisfied in the 'spin' stage.)
we have thought about having multipel netgraph threads.
(and there is a patch to do this.. I don';t think it has been
added yet). so you are both right..
a node should avoid doing a mutex if there is any chance that that
mutex may end up taking any real time to satisfy, due to mutex 
chaining etc.)  I almos think there should be a mutex type
that is called  'leaf' mutex.  no other mutex may be taken
while it is held..
In other words, you can use a mutex in Netgraph IFF you can guarantee
that all potential holders of the mutex will never acquire
any other locks or do anything that may take real time.
it's probably ok to take a lock to protect updating a statistics block
if you know that the only other holders will be the same code you are 
writing, in another thread, and that that code will never need to do 
anything except update the memory.

So the question becomes. If HPS uses a mutex, what are the other users 
of the mutex doing?

> 
>>> from what i can see you are _NOT_ using _SPIN_ mutexes (aka spin
>>> locks). you are using regular mutexes. let me quote locking(9) man
>>> page
>>>
>>> "
>>> Mutexes
>>>      Basically (regular) mutexes will deschedule the thread if the mutex
>>> can not be acquired.  A non-spin mutex can be considered to be equivalent
>>> to getting a write lock on an rw_lock (see below),
>>> "
>>>
>>> so, if thread can not get mutex it will be descheduled. this
>>> absolutely can not happen in netgraph!
>> There are mutexes inside the taskqueue aswell. The problem will be the same
>> there if you don't use a so-called fast tasqueue.
> 
> well, yes. technically, its not 100% correct, but it has to be like it
> because there is simply no way around this. if you read the blob at
> the top of the ng_ubt2.c you know that it uses regular sc_mbufq_mtx
> (which can sleep). let me quote the blob here again
> 
> " Access to the outgoing queues and task flags is controlled by the
>   sc_mbufq_mtx lock. It is an unavoidable evil. Again, sc_mbufq_mtx should
>   really be a spin lock."
> 
> the sc_mbufq_mtx is _NOT_ marked as SPIN mutex only because WITNESS
> has to know about spin locks. it is generally NOT a good idea to add a
> spin lock when someone feels like it.
> 
> now back to taskqueue_enqueue(), yes, taskqueue_enqueue_fast() is more
> appropriate here. however, taskqueue_enqueue() was chosen because
> taskqueue_enqueue() does not do anything that could cause sleep in
> TQ_LOCK()/TQ_UNLOCK() (its just basically a lookup).
> 
> from taskqueue man page
> 
> "Enqueueing a task does not perform any memory allocation which makes
> it suitable for calling from an interrupt handler."
> 
> USB_BUS_LOCK()/USB_BUS_UNLOCK() are different is this regard because
> those locks used when usb2 does things with hardware. also, like i
> said, there could be few usb devices sharing the same bus and thus the
> same USB_BUS_LOCK.
> 
> once again the rule is: NO SLEEPING/STALLING IN NETGRAPH. it DOES NOT
> mean that you can not use mutexes. it only means that you have to be
> very careful of how the mutex is used.
> 
>>> first of all, i do not think crashes are caused by detach(). in fact,
>>> detach() is clean. i've tested it and it worked for me. i tried to
>>> start/stop device while doing flood l2ping. i also tried to yank the
>>> device while doing flood l2ping. it works correctly. i think, the
>>> issue is related to stalled transfers. there is still something wrong
>>> with the code path that deals with stalled transfers. stalls do not
>>> happen on my test box, so i can not test it. also there is NO code
>>> duplication. asynchronous path is required to decouple netgraph from
>>> usb2.
>> Only if netgraph is run from fast interrupt context.
> 
> no, no, no. we are going back in circles. i would very much like to do
> things synchronously, but i can not. imo, i've given you plenty
> reasoning behind async design. please understand, asyn design has to
> stay. even if "sync" code appears to work, its still wrong.
> 
>>> regular mutexes can sleep. we are not allowed to sleep in netgraph.
>>> therefor we must transition out of netgraph context before calling
>>> into any code that tries to grab regular mutex. the async design is
>>> there not because i want to make things complicated. its there because
>>> it is needed.
>> I think there are two definitions of sleeping.
>>
>> 1) When a thread is waiting for a mutex it is not sleeping in the same way
>> like if it was to call "tsleep()".
>>
>> 2) When a thread is waiting for a wakeup it is surely sleeping, which can
>> happen inside sx_lock() and tsleep().
> 
> when i say "sleeping" i mean that the thread that is trying to get the
> lock is de-scheduled. that is, it is not running anymore. i'm not
> talking about [tm]sleep() etc. just the fact that thread is not
> running.
> 
> oh, and to possibly answer your next question, freebsd's mutexes are
> ADAPTIVE, meaning that thread will will spin on a mutex for a little
> bit if it can not grab it right away BEFORE going to sleep. this means
> that the sc_mbufq_mtx mutex is very likely to be an equivalent of a
> spin lock because it is mosty used to protect queue while en/dequeuing
> mbuf etc. the same COULD apply for TQ_LOCK in taskqueue_enqeue().
> 
> thanks,
> max
> _______________________________________________
> freebsd-current at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe at freebsd.org"