lots of malloc(M_WAITOK)'s in interrupt context from camisr

Terry Lambert tlambert2 at mindspring.com
Tue Apr 29 12:42:58 PDT 2003


John Baldwin wrote:
> On 29-Apr-2003 Archie Cobbs wrote:
> > [ moving this followup to -arch ]
> >
> > Random thought.. it's always seemed unnatural to me to say that
> > interrupt threads can't sleep.
> >
> > Why couldn't the system be designed so that if an interrupt thread
> > tried to sleep, it would actually sleep but atomically (a) detach
> > itself from the interrupt and (b) spawn a new thread to handle future
> > interrupts. I.e., sleep with "on demand" additional interrupt thread
> > creation.
[ ... ]
> If you need to do more work in your interrupt routine than just wakeups
> and dinking with registers, you can always wake up a software interrupt
> handler or some other random kthread to do things that take a long amount
> of time.  Sleeping in an interrupt thread would destroy interrupt latency
> far worse than it is now.  I'm sure we can all agree that that would be
> unacceptable.  Rather than making the interrupt thread implementation
> very complex with magical spawning kthreads and such, I would prefer that
> driver authors kick up software interrupt threads and the like on their
> own and keep the ithread implementation simple.

Raising a software interrupt to handle the work presents its own
set of problems; for example, one of the biggest factors in TCP
performance is the latency between the interrupt for the packets
and the running of the NETISR code.  There are knobs to get rid
of this latency, but they are not on by default.

Adding soft interrupts also presents the possibility that the soft
interrupt thread will be scheduled on a CPU other than the one that
took the hard interrupt.  If this occurs, then every interrupt is
going to result in cache busting and TLB misses, so unless you can
guarantee affinity between the hard interrupt CPU and the soft
interrupt thread, you are pretty screwed.  This is currently also
a factor in TCP performance on SMP machines when you are using
NETISR, which can effectively cause the mbufs with the data to
CPU-hop during protocol processing.

IMO, it's much better to run interrupts as far to completion as
possible.  The Jeff Mogul paper is instructive here (separating
hard and soft interrupt processing is the main recipe for
receiver livelock); so are the Peter Druschel/Mohit Aron/Rice
University papers on the ScalaServer project.

I think that to effectively handle, for example, disk interrupts
for an NFS server the way you are suggesting would require that
a lot of the disk I/O subsystem be changed to support the moral
equivalent of the DEVICE_POLLING interface in the network code;
that's not really worth it.  Better to do the work when the
hardware asks you to do it, than to pray for rain.

You are still screwed on user space process CPU starvation, no
matter what you do, unless you go to something like weighted
fair share queueing, and the kernel threads participate in the
scheduling process as priority-lending peers for the user space
code.  That's really complicated to implement, although QLinux
seems to have been able to do it with a fairly small team (who
are, admittedly, professor-level professional OS researchers).

-- Terry


More information about the freebsd-arch mailing list