Interrupt Threads
John Baldwin
jhb at freebsd.org
Fri Sep 17 15:23:49 UTC 2010
I have wanted to rework some of the interrupt threads stuff and enable
interrupt filters by default for a while. I finally sat down and
hacked out a new ithreads implementation at BSDCan and the following week.
The new ithreads stuff moves away from dedicated threads per handler or per IRQ.
Instead, it adopts a model more akin to what Solaris does (though probably not
completely identical). Each CPU has a queue of "pending handlers". When an
interrupt fires, all of the handlers for that interrupt are placed on to that
CPU's queue. There is a pool of hardware interrupt threads. If the current
CPU does not already have an active hardware interrupt thread, it grabs a free
one from the pool, pins it to the current CPU, and schedules it. The ithread
continues to drain interrupt handlers from its CPU's queue until the queue is
empty. Once that happens it disassociates itself from the CPU and goes back
into the free pool. The effect is that interrupt handlers are now sort of
like DPCs in Windows.
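
As a rough illustration of the model described above (not the code from the
patch), here is a single-threaded userspace sketch of per-CPU pending queues
backed by a shared ithread pool. Every name in it (cpu_queue, intr_fire,
ithread_grab, ithread_run, the sizes) is made up for the example, and it
leaves out locking, real thread scheduling, and CPU pinning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPU      2   /* assumed CPU count for the sketch */
#define QUEUE_MAX 8   /* assumed per-CPU queue capacity */
#define POOL_SIZE 4   /* assumed number of pooled ithreads */

typedef void (*handler_fn)(void *arg);

struct pending { handler_fn fn; void *arg; };

/* Per-CPU FIFO of pending handlers plus the ithread bound to it. */
struct cpu_queue {
    struct pending items[QUEUE_MAX];
    size_t head, count;
    int ithread;            /* index into pool[], or -1 if none is active */
};

struct ithread { bool busy; int cpu; };

static struct cpu_queue cpus[NCPU];
static struct ithread pool[POOL_SIZE];

static void sim_init(void)
{
    for (int c = 0; c < NCPU; c++) {
        cpus[c].head = cpus[c].count = 0;
        cpus[c].ithread = -1;
    }
    for (int i = 0; i < POOL_SIZE; i++) {
        pool[i].busy = false;
        pool[i].cpu = -1;
    }
}

/* Take a free ithread and bind it to cpu; returns -1 if the pool is empty. */
static int ithread_grab(int cpu)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].busy) {
            pool[i].busy = true;
            pool[i].cpu = cpu;
            return i;
        }
    }
    return -1;
}

/* "Interrupt fires": queue a handler and make sure the CPU has an ithread. */
static bool intr_fire(int cpu, handler_fn fn, void *arg)
{
    struct cpu_queue *q = &cpus[cpu];

    if (q->count == QUEUE_MAX)
        return false;
    q->items[(q->head + q->count) % QUEUE_MAX] = (struct pending){ fn, arg };
    q->count++;
    if (q->ithread < 0)
        q->ithread = ithread_grab(cpu);
    return true;
}

/* Body of the ithread: drain the CPU's queue, then go back to the pool. */
static void ithread_run(int cpu)
{
    struct cpu_queue *q = &cpus[cpu];
    int t = q->ithread;

    assert(t >= 0);
    while (q->count > 0) {
        struct pending p = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_MAX;
        q->count--;
        p.fn(p.arg);
    }
    pool[t].busy = false;
    pool[t].cpu = -1;
    q->ithread = -1;
}

/* Example handler: bumps the int counter passed as its argument. */
static void count_handler(void *arg)
{
    (*(int *)arg)++;
}
```

Firing two interrupts on the same CPU in this sketch queues two handlers but
binds only one ithread, which runs both handlers and then returns to the pool.
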
If an interrupt handler blocks on a turnstile and there are other handlers
pending for this CPU, then the current ithread is divorced from the current
CPU and a new ithread is allocated for the current CPU.
If we ever fail to allocate an ithread for a given CPU, then a flag is set.
All ithreads check that flag before going idle, and if it is set they find the
first CPU that needs an ithread and move to that CPU and start draining
events.
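
One way to picture the "check before going idle" step is below. Note that the
post describes a single flag; this sketch swaps in a per-CPU bitmask so that
finding the first CPU in need and claiming it are one atomic step. The names
(starved_cpus, mark_starved, claim_starved_cpu) and the 32-CPU limit are
assumptions for the example, not anything from the patch.

```c
#include <assert.h>
#include <stdatomic.h>

/* Bit n set => CPU n failed to get an ithread (assumes at most 32 CPUs). */
static atomic_uint starved_cpus;

/* Called when allocating an ithread for cpu fails. */
static void mark_starved(int cpu)
{
    assert(cpu >= 0 && cpu < 32);
    atomic_fetch_or(&starved_cpus, 1u << cpu);
}

/*
 * Called by an ithread that is about to go idle.  Returns the lowest
 * numbered CPU that needs an ithread, or -1 if there is none.
 */
static int claim_starved_cpu(void)
{
    unsigned mask = atomic_load(&starved_cpus);

    while (mask != 0) {
        int cpu = 0;

        while ((mask & (1u << cpu)) == 0)
            cpu++;
        /* On failure the CAS reloads mask and we retry with the new value. */
        if (atomic_compare_exchange_weak(&starved_cpus, &mask,
            mask & ~(1u << cpu)))
            return cpu;
    }
    return -1;
}
```
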
The ithread pool can be dynamically resized at runtime via sysctl, but it
can't be smaller than NCPU * 2 or larger than the total number of handlers.
Interrupt filters fit into this nicely. The old interrupt filter code has a
design bug, and fixing it could require scheduling multiple ithreads for a
single interrupt. With the new model, an interrupt still schedules at most
one ithread.
To handle masking the interrupt and unmasking it when filters w/o handlers
complete, I use a simple reference count with atomic ops to keep track of the
number of queued handlers that need the interrupt masked and unmask it once
the count drops to 0.
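
The reference counting idea can be sketched with C11 atomics roughly as
follows. This is a userspace illustration only: struct intr_source, the
masked field standing in for the interrupt controller, and both function
names are assumptions, and the kernel code would use its own atomic(9)
primitives instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct intr_source {
    atomic_int mask_refs;   /* queued handlers that need the line masked */
    bool masked;            /* stand-in for the interrupt controller state */
};

/* Called when nhandlers handlers that need masking are queued. */
static void intr_mask_for(struct intr_source *src, int nhandlers)
{
    assert(nhandlers > 0);
    src->masked = true;     /* mask before the handlers can run */
    atomic_fetch_add(&src->mask_refs, nhandlers);
}

/*
 * Called as each queued handler completes.  Returns true if this call
 * dropped the last reference and unmasked the interrupt.
 */
static bool intr_handler_done(struct intr_source *src)
{
    int prev = atomic_fetch_sub(&src->mask_refs, 1);

    assert(prev > 0);
    if (prev == 1) {
        src->masked = false;
        return true;
    }
    return false;
}
```

With two handlers queued, the first completion leaves the interrupt masked
and only the second one unmasks it.
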
Software interrupts still use a dedicated ithread, but the queue of pending
handlers lives in the ithread, not in the CPU.
I've also added some extensions to the current ithreads stuff based on some
tricks that existing drivers use. Specifically, an interrupt handler can now
call hwi_sched() on itself to reschedule itself at the back of the current
CPU's queue. Thus, you can have NIC interrupt handlers do cooperative
timesharing by just punting after N packets and using hwi_sched() to
reschedule themselves. I also added a new type of interrupt
handler that is registered with INTR_MANUAL. It is never automatically
scheduled, but a filter can schedule it.
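
The cooperative timesharing pattern looks roughly like this. The post does
not give the signature of hwi_sched(), so the sketch uses a stand-in called
sketch_hwi_sched() that just records the request; struct nic, RX_BUDGET, and
the simulated packet counters are likewise invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RX_BUDGET 4     /* the "N packets" per pass; value is assumed */

struct nic {
    int rx_pending;     /* packets waiting in the ring (simulated) */
    int rx_done;        /* packets processed so far */
    bool rescheduled;   /* set when the handler asked to run again */
};

/* Stand-in for hwi_sched(); the real function would requeue the handler. */
static void sketch_hwi_sched(struct nic *sc)
{
    sc->rescheduled = true;
}

/* Process at most RX_BUDGET packets, then punt if more work remains. */
static void nic_rx_handler(void *arg)
{
    struct nic *sc = arg;
    int budget = RX_BUDGET;

    assert(sc != NULL);
    sc->rescheduled = false;
    while (sc->rx_pending > 0 && budget-- > 0) {
        sc->rx_pending--;
        sc->rx_done++;
    }
    if (sc->rx_pending > 0)
        sketch_hwi_sched(sc);   /* go to the back of the CPU's queue */
}
```

Because the handler goes to the back of the queue rather than looping, other
handlers pending on the same CPU get a turn between passes.
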
As a test, I've ported the igb(4) driver to this framework. It uses
hwi_sched() and an INTR_MANUAL handler for link events to replace almost all
of the taskqueue usage in igb(4). (The multiqueue transmit bits still need a
task for one case, but all the interrupt handler stuff is now "simpler").
Some downsides to this approach include:
1) If you have two busy devices whose interrupts both go to the same CPU but
via different IRQs, in the old model those threads could run concurrently on
separate CPUs, but in the new model the handlers are tied to the same CPU and
compete for CPU time on that CPU. In other words, the new model really wants
interrupts to be evenly distributed amongst CPUs to work properly. Not
entirely sure what I think about that.
2) Many folks find it useful to see in top how much CPU time IRQ N's thread
has used, but the new model loses that since there is no longer a tight
coupling between IRQs and threads.
One unresolved issue is that the cardbus code currently uses a filter that
returns just FILTER_SCHEDULE_THREAD without FILTER_HANDLED. This is not
supported in the new code. I have some ideas on how to fix the cardbus code
(most likely using wrappers around the child interrupt handlers) but need to
hash the details out with Warner.
A second unresolved issue is that interrupt storm detection is currently
broken. I have some thoughts on how to re-add it, but it will likely be a bit
tricky.
The code currently lives in p4 at //depot/user/jhb/intr/... I have also put
up a patch at http://www.freebsd.org/~jhb/patches/intr_threads.patch. This
patch includes the changes to the igb(4) driver.
--
John Baldwin