cvs commit: src/sys/dev/fxp if_fxp.c if_fxpvar.h

Tue Apr 29 11:44:30 PDT 2003

On 29-Apr-2003 Andrew Gallatin wrote:
> 
> M. Warner Losh writes:
>  > In message: <20030429133708.A84234 at grasshopper.cs.duke.edu>
>  >             Andrew Gallatin <gallatin at cs.duke.edu> writes:
>  > : What is this deadlock with bus_teardown_intr?  Could we possibly fix
>  > : this some other way than by adding (mostly) useless code to the
>  > : critical path? 
>  > 
>  > I can remove it if it really annoys you that much.  The locking code
>  > itself will swamp the extra load/compare that happens.  It will put
> 
> 
> "A million here, a million there, pretty soon it adds up"
> 
> I'm just disappointed to see things bloat, and bloat, and bloat more. 
> FreeBSD used to be fast..
> 
> 
>  > the race back into the code, however.  This likely means that some
>  > higher level of locking is necessary so that we can make sure that the
>  > interrupts can't happen once detach starts.
>  > 
>  > Warner
> 
> 
> How about just replacing the intr handler with a dummy?  That seems
> fairly easy and doesn't require pessimizing critical path code.
> 
> Would you mind trying something like the following (sorry, I'm in the
> middle of moving and don't have machines handy to test ... most of my
> lab is packed).  I don't even have a tree handy to do diffs.
> 
> Drew
> 
> 
> Add this to kern/kern_intr.c:
> 
> void
> ithread_orphan_handler(void *arg)
> {
> }
> 
> int
> ithread_replace_handler(void *cookie, driver_intr_t func)
> {
>         struct intrhand *handler = (struct intrhand *)cookie;
>         struct ithd *ithread;
>#ifdef INVARIANTS
>         struct intrhand *ih;
>#endif
>         ithread = handler->ih_ithread;
>         KASSERT(ithread != NULL,
>             ("interrupt handler \"%s\" has a NULL interrupt thread",
>                 handler->ih_name));
> 
>         if (handler == NULL)
>                 return (EINVAL);
> 
>       mtx_lock(&ithread->it_lock);
>       
>       handler->ih_handler = func;
>       mtx_unlock(&ithread->it_lock);
>       return (0);
> }
> 
> 
> And make if_fxp.c look like this:
> 
> /*
>  * Detach interface.
>  */
> static int
> fxp_detach(device_t dev)
> {
>         struct fxp_softc *sc = device_get_softc(dev);
>         int s;
> 
>         FXP_LOCK(sc);
>         s = splimp();
> 
>         sc->gone = 1;
>         /*
>          * Close down routes etc.
>          */
>         ether_ifdetach(&sc->arpcom.ac_if);
> 
>         /*
>          * Stop DMA and drop transmit queue, but disable interruptsfirst.
>          */
>         CSR_WRITE_1(sc, FXP_CSR_SCB_INTRCNTL, FXP_SCB_INTR_DISABLE);
>         fxp_stop(sc);
>       if(ithread_replace_handler(sc->ih, ithread_orphan_handler))
>               panic("ithread_replace_handler didn't");                          
>         FXP_UNLOCK(sc);
> 
>         /*
>          * Unhook interrupt before dropping lock. This is to prevent
>          * races with fxp_intr().
>          */
>         bus_teardown_intr(sc->dev, sc->irq, sc->ih);
>         sc->ih = NULL;
> 
>         splx(s);
> 
>         /* Release our allocated resources. */
>         fxp_release(sc);
>         return (0);
> }
> 
> static void
> fxp_intr(void *xsc)
> {
>         struct fxp_softc *sc = xsc;
>         struct ifnet *ifp = &sc->sc_if;
>         u_int8_t statack;
> 
>         FXP_LOCK(sc);
>#ifdef DEVICE_POLLING
>         if (ifp->if_flags & IFF_POLLING) {
> <..>

First off, it's a gross API violation.  ithread's are supposed to be
transparent (except that you can use locks in your interrupt handlers).
Secondly, you still have the same race as if you just removed the gone
condition.  We don't hold the ithread lock while executing handlers,
so there is nothing preventing the handler from being executed on another
CPU concurrently with your detach function.  In fact, it could easily
be blocked on the FXP lock.  You do your magic pointer foo, then unlock
the fxp.  The unlock releases the interrupt handler on another CPU
which happily executes _after_ the completion of bus_teardown_intr()
which has all sorts of bad failure modes, like panicing when it tries
to FXP_UNLOCK at the end because back on the first CPU you've done a
mtx_destroy().

-- 

John Baldwin <jhb at FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/