panic: mutex Giant not owned at /usr/src/sys/kern/tty_ttydisc.c:1127

Fri Jan 23 12:44:39 PST 2009

Hi Maksim,

On Friday 23 January 2009, Maksim Yevmenkin wrote:
> On Fri, Jan 23, 2009 at 11:29 AM, Hans Petter Selasky <hselasky at c2i.net> 
wrote:
> > Hi Maksim,
> >
> > First of all I've made some patches which try to tidy up the USB blutooth
> > driver. Please see:
> >
> > http://perforce.freebsd.org/chv.cgi?CH=156577
> > http://perforce.freebsd.org/chv.cgi?CH=156579
>
> thanks! unfortunately, i have few problems with those patches. please
> read below.
>
> >> that should not be needed (in theory) but i will add it.
> >
> > Yes! We are multithreaded!
>
> please tell me that you read the blob about locking strategy at the
> top of ng_ubt2.c file. when ubt_task() is pending, it holds a
> reference to the netgraph node. that means the pointer is still
> pointing to a valid structure, but the node is marked as "dead" and
> NG_NODE[_NOT]_VALID() macros can be use to check for that. so, even if
> task is still pending while the rest of ng_ubt2 stuff is gone, it does
> not matter because task holds netgraph node. so the next time task is
> executed it will see that the node is "dead", release the node and
> simply return doing nothing.

This kind of complexity is unneccesary. We should be able to drain a node, or 
sleep until a node is released.

0) stop all USB activity
1) free node
2) wait until the node is actually freed
3) free USB resources which might access node fields

Conclusion:

No checks are required inside any of the USB callbacks.

>
> >> > 2) You drain all USB transfers: "usb2_transfer_drain()" called
> >> > unlocked.
> >>
> >> yes, usb2_transfer_unsetup() does that, does it not?
> >
> > That is correct, but sometimes you need to stop the transfers at an
> > earlier point. It depends on your design.
>
> when detach() is called its already out of driver's hands. all it can
> do now is to call unsetup(), or am i missing something here?

The correct thing is to do a usb2_transfer_drain() which will wait until the 
callback has been called back, if a transfer is pending.

In USB2 a "usb2_start_hardware()" call is always paired with a USB transfer 
callback, no matter what you do! This way of design has been chosen so that 
you can do refcounts inside the callback!

> > I see in your code that you try to do things Asynchronously. This
> > complicates stuff quite a lot. In my patches I've tried to do some things
> > synchronously.
>
> no, No, NO (sorry for shouting). we _CAN_NOT_ sleep in netgraph.
> period. you _CAN_NOT_ grab usb interface mutexes unless you can
> absolutely guarantee that they will not sleep for any significant
> amount of time. 

usb2_transfer_start() and usb2_transfer_stop() are non-blocking and will never 
sleep. The exception is "usb2_transfer_drain()" which can sleep for a few 
milliseconds and is only called when the hook is disconnected. I do not see 
that as a problem versus having "X" more checks in the code.

> netgraph is essentially single threaded and by 
> sleeping you will stall entire netgraph subsystem and not just current
> thread.

If we sleep 10 milliseconds on a UBT hook disconnect, is that a big problem? 
How often will UBT hook disconnect be called?

>
> so, objection #1 (the biggest one) is: please leave sc_mbufq_mtx and
> ubt_task() glue in place. it is needed as far as i can tell.

What I see is that USB already has a separate process handling the callback so 
that when you call "usb2_transfer_start()" this other process is woken up. 
Are you sure that there is a measurable gain using "ubt_task()" versus the 
way I'm doing it?

>
> ok, that makes sense. except the case when hardware is gone and
> transfer callback is called with IOERROR (or something like that). i
> guess the question is can callback be called with IOERROR for any
> reason other than hardware departure?

Yes, if there is interference on the cable. It is quite valid to setup an USB 
transaction after that the hardware has dissappeared, in xxx_detach() for 
example. Remeber USB is a polling bus. If the device is not there (I.E. not 
responds) the USB transfer will simply terminate, either immediately or after 
a timeout.

>
> objection #2 (somewhat major). please do NOT remove NG_NODE_REF() call
> in detach() before calling ng_rmnode_self() (unless you remove
> NG_NODE_UNREF() below as well). again the reason here is to tell
> netgraph node that we are dying, but make sure we keep node_p pointer
> pointing to a valid structure as long as possible. to summarize,

We are already holding a reference to the node from ubt_attach(), right? Why 
do we need another node reference?

>
> NG_NODE_REF(node) <- grab our reference to ensure node pointer remains
> valid ng_rmnode_self(node) <- tell node we are dying, mark node as "dead"
> but NOT free node structure
> /* do other stuff */
> NG_NODE_UNREF(node) <- drop our reference and possibly free node pointer
>
>
> more of a question than objection. why you insist on having single
> mutex for both interfaces? isoc transfers callbacks will be called
> much more often that control, bulk and interrupt callbacks, so
> wouldn't it make sense to use different lock for isoc transfers?

Because a single mutex is much easier to maintain and I see no gain fine 
graining more for a bluetooth device :-)

> changes in attach() and usb2_config structures are fine (i guess). but
> please keep asynchronous design in place. it is required.

Is it possible we can make a compromise here? From my experience synchronous 
code is smaller and easier to understand than asynchronous code! The only 
disadvantage about synchronous code is that more threads might be required. 
And hence Netgraph is single threaded other signalling might be delayed, but 
is that a real problem if we are talking about delays around 1x10ms ever time 
you disconnect the hook of an UBT node?

--HPS