cvs commit: src/sys/dev/fxp if_fxp.c if_fxpvar.h

John Baldwin jhb at FreeBSD.org
Tue Apr 29 12:24:14 PDT 2003


On 29-Apr-2003 Andrew Gallatin wrote:
> 
> John Baldwin writes:
> 
>  > 
>  > First off, it's a gross API violation.  ithread's are supposed to be
>  > transparent (except that you can use locks in your interrupt handlers).
> 
> The API seems designed for inefficiency, so it needs violating ;)
> 
>  > Secondly, you still have the same race as if you just removed the gone
>  > condition.  We don't hold the ithread lock while executing handlers,
>  > so there is nothing preventing the handler from being executed on another
>  > CPU concurrently with your detach function.  In fact, it could easily
>  > be blocked on the FXP lock.  You do your magic pointer foo, then unlock
>  > the fxp.  The unlock releases the interrupt handler on another CPU
>  > which happily executes _after_ the completion of bus_teardown_intr()
> 
> I assumed dropping the fxp lock would be enough to encourage any
> pending handlers to finish.  Would simply tsleep()'ing for a second
> work?

It would encourage them, but they can easily be executed in parallel
on another CPU with the rest of the detach function.  Throwing in random
sleeps isn't going to address that since the other CPU could always get
another interrupt while this one sleeps.

> If needed we should violate the API even more to check to see if any
> handlers are pending.  Anything is better than adding instructions to
> the critical path.

Where would you do this check for more pending handlers?  In detach?
What does detach do when it's true?  The problem is that detach may
be too late.  Take this sequence:


        CPU 0           CPU 1
        --------        --------
        foo_detach()    ...
        FOO_LOCK()      interrupt for foo
        ...             foo_intr()
        ...             FOO_LOCK() and blocks
        bus_teardown_intr()

Hmm.  Now we have a nice deadlock unless we take very drastic measures
like trying to kill threads out from under ourselves and leaving held
locks, etc. in their wake.  Also, warner's check for gone outside of
the fxp lock is not safe, since it can still lose a race where the gone
bit gets set after the check but before foo_intr() gets the lock.

Now, we have to synchronize between foo_intr() and foo_detach() somehow.
We need some way in foo_intr() to know to do that.  That's going to
involve some kind of check.  Kernel panics are not an acceptable
alternative.  So, how do you propose to solve the above sequence w/o
using a flag check in foo_intr() for synchronization?  I'm not really
prepared to try and kill a thread executing on another CPU as that is
1) hard and 2) error prone (we just dinked with registers in detach to
turn things like interrupts off, allowing the interrupt routine to
execute can result in things like interrupts getting re-enabled, which
would be quite bad, also, if we wanted to do that we would have to
unwind all the locks held by the ithread and just _hope_ that it doesn't
leave anything in an inconsistent state to do so, along with adding
WITNESS-like pessimizations to the _default_ case for locking to maintain
enough state to allow for such an unwind).

> We really need to think about efficiency.  Our 5.x performance sucks.
> Really sucks.  We're being nickled and dimed to death by extra
> instructions here, there, and everywhere.

Unfortunately 5.x attempts to run with a thread-safe kernel, and that
involves extra overhead to work around races that 4.x didn't even have
to dream about.  In theory the increased performance should come from
increased parallelism at the cost of increased overhead.  If FreeBSD
decides that a thread-safe kernel isn't worth it then I guess we can
pitch 5.x and start over.

-- 

John Baldwin <jhb at FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


More information about the cvs-src mailing list