Potential source of interrupt aliasing

Mon Apr 11 05:59:50 PDT 2005

Matthew Dillon wrote:
> :...
> :that I mentioned precisely because we don't mask the IOAPIC for fast
> :handlers.  Unfortunately, moving the entire OS to this scheme is
> :quite labor-intensive.  It would make just as much sense to implement
> :MSI infrastructure and convert a number of drivers to that.  And again,
> :Linux seems immune to this problem, so it's very intriguing to find out
> :why.
> :
> :Scott
> 
>     kernel/io_apic.c line 1829ish (linux 2.6.9).  And the whole file in 
>     general.
> 
>     It appears that they simply do not EOI the APIC when handling a
>     level triggered interrupt until after the interrupt handler has
>     run.  And indeed, that is what appears to happen.  It looks like they
>     may still be vulnerable due to the way they shut down an interrupt
>     (but by then the device is presumably not asserting interrupts any more).
>     But for normal interrupt operation they simply do not EOI the APIC.
> 
>     I love the last sentence of this comment... OMG!  They have got to be
>     kidding.

This would explain it.  Unfortunately, I don't think that we can do the 
same in FreeBSD because:

1) the ithread won't run right away (unless PREEMPTION is enabled)
2) the ithread might block (remember this is a feature of ithreads =-)

The end result is that the CPU will often go off and do other things for
long periods of time instead of servicing the ithread, all the time with
the EOI not sent and thus timer interrupts not reaching it either.  We
program the APIC delivery to be round-robin, but on the P4 family that
degenerates into only delivering to CPU0.  I think that Linux tries harder
to do more complex CPU routing, but that still wouldn't be guaranteed to
help the situation we have here.
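
To make the starvation concrete, here is a rough model of how an
outstanding EOI holds off other interrupts on the same CPU.  It is only a
sketch: the struct and function names are made up for illustration (not
FreeBSD or Linux APIs), it ignores the TPR, and it assumes the timer sits
in a lower priority class than the device vector.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a local APIC's in-service state (illustration only). */
struct lapic_model {
    bool in_service[256];   /* true: vector was accepted and not yet EOI'd */
};

/* The priority class of a vector is its upper four bits. */
static unsigned prio_class(unsigned vector)
{
    return vector >> 4;
}

/* Return the highest in-service vector, or -1 if nothing is in service. */
static int highest_in_service(const struct lapic_model *l)
{
    for (int v = 255; v >= 0; v--)
        if (l->in_service[v])
            return v;
    return -1;
}

/*
 * A pending vector is dispatched only if its priority class is strictly
 * higher than that of the highest in-service vector (TPR ignored here).
 */
static bool can_deliver(const struct lapic_model *l, uint8_t vector)
{
    int isv = highest_in_service(l);

    if (isv < 0)
        return true;
    return prio_class(vector) > prio_class((unsigned)isv);
}

/* The CPU accepts an interrupt: mark its vector as in service. */
static void lapic_accept(struct lapic_model *l, uint8_t vector)
{
    l->in_service[vector] = true;
}

/* EOI retires the highest in-service vector. */
static void lapic_eoi(struct lapic_model *l)
{
    int isv = highest_in_service(l);

    if (isv >= 0)
        l->in_service[isv] = false;
}
```

With a device vector of, say, 0x50 accepted and no EOI sent, can_deliver()
is false for a timer at 0x40 until lapic_eoi() runs.  If the EOI waits on
an ithread that hasn't been scheduled yet, that window can be very long.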

So that's why I say that a two-tiered interrupt scheme is attractive;
you get the benefit of being able to service and quench the interrupt
right away along with the benefit of being able to block for locks and
resources in the ithread.
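
A minimal sketch of what I mean by two tiers, written as a user-space
simulation.  All of the names here (sim_dev, sim_cpu, tier1_filter,
tier2_ithread) are hypothetical and do not correspond to any existing
FreeBSD interface; a real driver would quench the device through its own
registers.

```c
#include <stdbool.h>

/* Hypothetical simulated device with a level-triggered interrupt line. */
struct sim_dev {
    bool irq_asserted;     /* device is currently driving its interrupt line */
    int  pending_events;   /* work queued for the second tier */
};

/* Hypothetical per-CPU state relevant to this example. */
struct sim_cpu {
    bool eoi_outstanding;   /* interrupt accepted, EOI not yet sent */
    bool ithread_runnable;  /* second tier has been scheduled */
};

/* The device raises its interrupt and the CPU accepts it. */
static void sim_raise_irq(struct sim_dev *dev, struct sim_cpu *cpu, int events)
{
    dev->pending_events += events;
    dev->irq_asserted = true;
    cpu->eoi_outstanding = true;
}

/*
 * Tier 1: runs in interrupt context and must not block.  It quenches the
 * source at the device, schedules the ithread, and sends the EOI right
 * away, so other interrupts are not held off while the ithread waits.
 */
static void tier1_filter(struct sim_dev *dev, struct sim_cpu *cpu)
{
    dev->irq_asserted = false;      /* quench: device stops asserting */
    cpu->ithread_runnable = true;   /* hand the real work to tier 2 */
    cpu->eoi_outstanding = false;   /* EOI sent immediately */
}

/*
 * Tier 2: runs later in thread context.  A real ithread could block on
 * locks or resources here; the EOI has already been sent, so that is safe.
 */
static int tier2_ithread(struct sim_dev *dev, struct sim_cpu *cpu)
{
    int handled = dev->pending_events;

    dev->pending_events = 0;
    cpu->ithread_runnable = false;
    return handled;
}
```

The point of the split is that no EOI is outstanding between
tier1_filter() and tier2_ithread(), however late the ithread gets to run.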

[...]

> [ more comments from the linux code ]:
> /*
>  * It appears there is an erratum which affects at least version 0x11
>  * of I/O APIC (that's the 82093AA and cores integrated into various
>  * chipsets).  Under certain conditions a level-triggered interrupt is
>  * erroneously delivered as edge-triggered one but the respective IRR
>  * bit gets set nevertheless.  As a result the I/O unit expects an EOI
>  * message but it will never arrive and further interrupts are blocked
>  * from the source.  The exact reason is so far unknown, but the
>  * phenomenon was observed when two consecutive interrupt requests
>  * from a given source get delivered to the same CPU and the source is
>  * temporarily disabled in between.
>  *
>  * A workaround is to simulate an EOI message manually.  We achieve it
>  * by setting the trigger mode to edge and then to level when the edge
>  * trigger mode gets detected in the TMR of a local APIC for a
>  * level-triggered interrupt.  We mask the source for the time of the
>  * operation to prevent an edge-triggered interrupt escaping meanwhile.
>  * The idea is from Manfred Spraul.  --macro
>  */
> 

This sounds like the famous EOI bug in PCI-Express handling in the 7520
chipset.  Good times.