miibus + USB = problem

Wed Aug 9 08:21:55 UTC 2006

On Wednesday 09 August 2006 08:07, M. Warner Losh wrote:
> In message: <200608021437.55500.hselasky at c2i.net>
>
>             Hans Petter Selasky <hselasky at c2i.net> writes:
> : I am currently in the progress of porting "if_aue.c" to my new USB API,
> : and have hit a problem that needs to be solved. The problem is that the
> : USB system sleeps under "miibus_xxx". Questions are:
> :
> : Is this allowed?
>
> Yes.
>
> : What locks are held when these functions are called ?
>
> Whatever locks are held when they are called :-).
>
> miibus is called in the following instances:
>
>  (1) during probe/attach of children (bus_generic_probe is the
>             typical cause)
>  (2) During the 'tick' routine.  The tick routine is called
>             from a timeout typically.  In the aue driver, for example,
>             this is done with timeout/untimeout.
>
> There's been rumblings for a genric mii /dev entry that can be used
> for things like kicking off the autonegotiation, etc, but that's not
> in the tree now, nor is it likely to be soon.
>
> The aue driver takes out the AUE_LOCK in these routines, and in
> detach, it unregisters the timeout.  Alas, it is stupid, and does this
> with the lock held, thus ensuring deadlock if the timeout fires after
> the lock is acquired, but before the untimeout can complete (meaning
> that the timeout routine would sleep forever waiting for the lock,
> oops).  This is why you can't run the detach routine locked in most
> cases.  

Yes, all of that is gone now. I use "callout_init_mtx()" and that solves the 
problem, except it does not wait for the last mtx_lock()/mtx_unlock(), in 
case of a race :-(

You need to hold a lock during detach. Else you can risk that the callbacks 
will re-start functions you have already shut down, like USB transfers, and 
then you never get detached.

> You have to make sure that all other threads have exited, and 
> that can be tricky.

Did you look at the new if_aue, at:

http://www.turbocat.net/~hselasky/isdn4bsd/sources/src/sys/dev/usb/

>
> : The problem with USB devices, is that the read-register process is very
> : slow. It can take up to several milliseconds. And if the device is
> : suddenly detached one has to think about adding exit code everywhere.
>
> Yes.  One does.  There's no such thing as a free lunch in the kernel,
> alas.
>
> : The solution I see with USB devices is something like this:
> :
> : if (sc->device_gone) {
> :
> :   exit mutexes ;
> :
> :   kthread_exit(0);
> : }
> :
> : Of course I cannot "kthread_exit()" when the call comes from
> : read/write/ioctl, because there is a stack, that expects a returning
> : call. If the kernel code was objective C, then maybe one could throw an
> : exception or do something alike so that the processor gets out of the
> : USB-read procedure.
>
> Yes.  You can't longjmp your way out of this problem. :-)  There's no
> thread to kill here...
>
> : Solutions:
> :
> : 1) use USB hardware polling, not releasing any mutexes, simply using
> : DELAY(), until read/writes complete.
>
> That's insanely ugly.

Yes, I agree on that. But if you're out of time, and just need to have 
something working solid, then just use hardware polling. That's why I 
provided that option.

>
> : 2) pre-read all read registers regularly. How can I do this with
> : "miibus"?
>
> You can't do that either.  You have to use the aue 'registers' to read
> the mii bus management registers.

Yes, but I can call "mii_pollstat()" from the "config thread", and then cache 
the two integers it stores the current status in.

>
> The only way to deal with this is to have all the places that read the
> registers deal with getting an error.  If the device really is dying,
> you can set a flag so that further reads also return an error to catch
> places where the code is sloppy and doesn't check the return value.
>

This is not going to be a perfect solution. You really need to run all USB 
reads/writes for a separate thread, that you tear down first.

> In fact, the aue_csr_read_* routines do exactly this already, and much
> of the code is written to check for the dying.  Apart from the
> deadlock in the locking unwiding (I'm sure there are others, because
> we also call if_detach with this lock held...)

I do something similar.

--HPS