Weird PCI interrupt delivery problem (resolution, sort of)
craig at tobuj.gank.org
Mon Jan 23 18:25:16 PST 2006
On Fri, Jan 20, 2006 at 03:42:21PM -0500, John Baldwin wrote:
> On Thu, Jan 19, 2006 at 10:17:39PM -0700, Scott Long wrote:
> > This points to a bus coherency problem. I wonder if your BIOS is
> > incorrectly setting the memory region of the apics as cachable. You'll
> > want to bug Baldwin about this.
> Hmm, well, you can actually try the PAT patch if you are feeling brave as it
> maps all devices (including APICs) as uncacheable.
Tried the updated PAT patch (with s/pmap_unmapbios/pmap_unmap_bios/ to
get ACPI to compile). Unfortunately if it is a caching problem, PAT
isn't able to fix it. Same result as stock kernel -- interrupts stop
arriving after a dozen or so. AFAICT the local APIC is the only
memory-mapped I/O region that seems to be problematic.
Instead of writing the value twice, I also tried inserting an
__asm("nop") before the write with no effect. Also, a single write to
an unrelated area doesn't help:
+static volatile int dummyeoi;
+	dummyeoi = 1;
 	lapic->eoi = 0;
+	dummyeoi = 2;
I'm _reasonably_ certain that marking dummyeoi volatile and leaving it
uninitialized will prevent gcc from optimizing those writes out. Forcing
R/W cycles (++dummyeoi) before and after doesn't work either.
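For anyone who wants to poke at the write sequence outside the kernel, here
is a minimal userland sketch of it. The struct fake_lapic and the function
fake_lapic_eoi() are hypothetical stand-ins made up for illustration, not the
real lapic code. All it shows is that the compiler has to emit each volatile
store, in program order. It says nothing about how the CPU or the bus orders
or posts those writes, which is the open question here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the memory-mapped local APIC.
 * Only the EOI register is modeled; ordinary memory, not real MMIO.
 */
struct fake_lapic {
	volatile uint32_t eoi;
};

/* Unrelated volatile word, as in the experiment above. */
volatile int dummyeoi;

/*
 * Dummy store, EOI store, dummy store.  Because every object touched
 * is volatile, the compiler may not drop or reorder these accesses.
 * Returns the value read back from dummyeoi.
 */
int
fake_lapic_eoi(struct fake_lapic *lapic)
{
	assert(lapic != NULL);
	dummyeoi = 1;
	lapic->eoi = 0;
	dummyeoi = 2;
	return (dummyeoi);
}
```

Compiling this with optimization on and reading the generated assembly is a
quick way to confirm that all three stores survive.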
A DELAY(1) before the lapic->eoi write does the trick, but DELAY does
lots of complicated things, so I don't know how useful a data point
that is.
I'm probably missing something, but if bad cache behavior was causing
writes to the lapic EOI register to not always take effect, wouldn't the
_next_ irq (even if it's a different line) cause the one that's
currently pending to be acknowledged?