panic: mutex Giant not owned at /usr/src/sys/kern/tty_ttydisc.c:1127

Sat Jan 24 01:52:59 PST 2009

On Saturday 24 January 2009, Maksim Yevmenkin wrote:
> Hans Petter,
>
> i'm sorry, did i mention there is no sleeping in netgraph? :-)

Can you elaborate on this? Is netgraph run from a fast interrupt context, so
that only spin locks are possible? Or is it run from a thread?

>
> from what i can see you are _NOT_ using _SPIN_ mutexes (aka spin
> locks). you are using regular mutexes. let me quote locking(9) man
> page
>
> "
> Mutexes
>      Basically (regular) mutexes will deschedule the thread if the mutex
> can not be acquired.  A non-spin mutex can be considered to be equivalent
> to getting a write lock on an rw_lock (see below),
> "
>
> so, if thread can not get mutex it will be descheduled. this
> absolutely can not happen in netgraph!

There are mutexes inside the taskqueue as well. The problem will be the same 
there if you don't use a so-called fast taskqueue.
>
>
> shutdown method is called as part of ng_rmnode_self() and drops the
> reference that node was born with. the extra reference before
> ng_rmnode_self() is to ensure that node pointer is still valid after
> ng_rmnode_self() returns. otherwise there is a chance that node
> pointer becomes invalid after ng_rmnode_self() calls shutdown
> method.

Ok, I need to fix that.
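
A minimal userspace sketch of the reference pattern described above, for illustration only. The names struct node, node_ref(), node_unref(), node_rmnode_self() and node_teardown() are hypothetical stand-ins, not the real netgraph API, and the plain int counter is an assumption that ignores the atomicity a kernel reference count needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a netgraph node; not the real struct ng_node. */
struct node {
	int  refs;     /* reference count; the node is freed when it reaches 0 */
	bool is_dead;  /* set by the shutdown path */
};

static int nodes_freed;  /* counts frees so the sketch can be checked */

static struct node *
node_create(void)
{
	struct node *n = calloc(1, sizeof(*n));

	if (n != NULL)
		n->refs = 1;  /* the reference the node is "born with" */
	return (n);
}

static void
node_ref(struct node *n)
{
	n->refs++;
}

static void
node_unref(struct node *n)
{
	assert(n->refs > 0);
	if (--n->refs == 0) {
		nodes_freed++;
		free(n);
	}
}

/* Models ng_rmnode_self(): shutdown runs and drops the birth reference. */
static void
node_rmnode_self(struct node *n)
{
	n->is_dead = true;
	node_unref(n);
}

/* Teardown that keeps the pointer valid until the caller is done with it. */
static bool
node_teardown(struct node *n)
{
	bool dead;

	node_ref(n);          /* extra reference taken before removal */
	node_rmnode_self(n);  /* drops the birth reference, node survives */
	dead = n->is_dead;    /* safe: our own reference is still held */
	node_unref(n);        /* last reference gone, node is freed here */
	return (dead);
}
```

Without the node_ref() call in node_teardown(), the read of n->is_dead would touch freed memory, which is the hazard the extra reference is there to prevent.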

>
> first of all, i do not think crashes are caused by detach(). in fact,
> detach() is clean. i've tested it and it worked for me. i tried to
> start/stop device while doing flood l2ping. i also tried to yank the
> device while doing flood l2ping. it works correctly. i think, the
> issue is related to stalled transfers. there is still something wrong
> with the code path that deals with stalled transfers. stalls do not
> happen on my test box, so i can not test it. also there is NO code
> duplication. asynchronous path is required to decouple netgraph from
> usb2.

Only if netgraph is run from fast interrupt context.

> regular mutexes can sleep. we are not allowed to sleep in netgraph.
> therefore we must transition out of netgraph context before calling
> into any code that tries to grab regular mutex. the async design is
> there not because i want to make things complicated. it's there because
> it is needed.
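
The decoupling argued for above can be sketched in userspace C. This is an assumption-laden illustration, not the usb2 or netgraph code: the no-sleep side hands work over through a single-producer, single-consumer ring built on C11 atomics, and only the deferred side takes a regular (blocking) mutex. All names here (defer_ring, defer_enqueue(), defer_drain(), state_lock) are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8  /* power of two, so indices stay consistent on wrap */

struct defer_ring {
	int         slots[RING_SIZE];
	atomic_uint head;  /* next slot to read; written only by the consumer */
	atomic_uint tail;  /* next slot to write; written only by the producer */
};

static struct defer_ring ring;
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static long state_total;  /* shared state guarded by state_lock */

/*
 * Called from the context that must not block. It takes no lock and
 * returns false instead of waiting when the ring is full.
 */
static bool
defer_enqueue(int work)
{
	unsigned tail = atomic_load_explicit(&ring.tail, memory_order_relaxed);
	unsigned head = atomic_load_explicit(&ring.head, memory_order_acquire);

	if (tail - head == RING_SIZE)
		return (false);
	ring.slots[tail % RING_SIZE] = work;
	atomic_store_explicit(&ring.tail, tail + 1, memory_order_release);
	return (true);
}

/*
 * Called from a context that is allowed to block, for example a worker
 * thread. Returns the number of work items handled.
 */
static int
defer_drain(void)
{
	int handled = 0;
	unsigned head = atomic_load_explicit(&ring.head, memory_order_relaxed);

	while (head != atomic_load_explicit(&ring.tail, memory_order_acquire)) {
		int work = ring.slots[head % RING_SIZE];

		head++;
		atomic_store_explicit(&ring.head, head, memory_order_release);

		pthread_mutex_lock(&state_lock);  /* may block; fine here */
		state_total += work;
		pthread_mutex_unlock(&state_lock);
		handled++;
	}
	return (handled);
}
```

The sketch only shows the shape of the hand-off. Whether a given kernel queueing primitive is itself free of regular mutexes is a separate question, and it is the one raised about taskqueues earlier in this mail.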

I think there are two definitions of sleeping.

1) When a thread is waiting for a mutex it is not sleeping in the same way 
as if it had called "tsleep()".

2) When a thread is waiting for a wakeup it is surely sleeping, which can 
happen inside sx_lock() and tsleep().

--HPS