8.1-RC2 MCE caused by some LAPIC/clock changes?

John Baldwin jhb at freebsd.org
Wed Jul 21 17:36:29 UTC 2010


On Wednesday, July 21, 2010 12:44:49 pm Markus Gebert wrote:
> 
> On 21.07.2010, at 14:36, Andriy Gapon wrote:
> 
> > on 21/07/2010 15:25 Markus Gebert said the following:
> >> On 21.07.2010, at 10:33, Andriy Gapon wrote:
> >> 
> >>> on 21/07/2010 03:57 Markus Gebert said the following:
> >>>> Another thing though: Today I compared verbose boot output from 8-stable
> >>>> and the current box. I saw that the ioapic sets up IRQ routing differently
> >>>> on these two systems although the hardware is the same. This seemed not so 
> >>>> interesting at first, but then I noticed that 8-stable sets up two routes
> >>>> (to lapic0 and lapic2, or sometimes lapic3) for IRQ58 (mpt0), while current
> >>>> only uses one route (to lapic0).
> >>> My understanding is that it's not "two routes", but re-routing. During early
> >>> boot all interrupts are bound to BSP; later, when APs become online, the
> >>> interrupts are re-distributed among available CPUs.
> >> 
> >> I guess you're right, misinterpretation on my side. Thanks for clarifying this.
> >> 
> >> 
> >> Now being aware of this, it seems to me that in the machdep.lapic_allclocks=0
> >> case, there might just be more interrupts to be assigned/routed due to "more
> >> clocks being used". If that's true, maybe it's just "luck" that in this case
> >> the mpt interrupt gets assigned to lapic0/cpu0 and the box runs fine. I'm just
> >> guessing though, since I have no clue how interrupts are assigned to lapics
> >> exactly (round-robin? some logic?).
> > 
> > Yes, round-robin, for interrupts that are not explicitly bound to specific CPUs.
> > The process is deterministic, but hard to predict indeed.
> 
> I see.
> 
> 
> >>>> I used 'cpuset -c -l 0 -x 58' in an attempt to make my 8-stable box behave 
> >>>> like the one running current. Indeed, this seems to have changed IRQ58 to
> >>>> be routed to lapic0 only. And the box was running for hours without showing
> >>>> the symptoms.
> >>>> 
> >>>> I just checked boot verbose output of my 8-stable box again (booted with
> >>>> machdep.lapic_allclocks=0 as mentioned above). And now it seems to have set
> >>>> up IRQ routes just like the current box (one route for IRQ58 to lapic0).
> >>> Not sure how to interpret this properly. One possibility is a hardware
> >>> problem where interrupt message route between ioapic2 and CPU to which lapic3
> >>> belongs is flaky. Perhaps, this might be a FreeBSD problem: it could be that
> >>> the system somehow tells us not to set up such routes, but we don't listen.  But
> >>> this is far fetched.
> >> 
> >> 
> >> I'm not sure either. If my "theory" above proved to be true, it would have been
> >> just luck, that 6.x and 7.x (and current) run just fine on the X4100M2. A
> >> (short) test on Ubuntu didn't trigger the problem, so the Linux kernel is
> >> either lucky too by selecting an interrupt route that is "not flaky", or
> >> there's indeed some way to figure out not to use some lapics for some
> >> interrupts. Or we didn't test Linux thoroughly enough.
> > 
> > Yep, it would be interesting to see how interrupts were distributed among CPUs on
> > that Linux.
> 
> 
> Well I can't provide this kind of information about _that_ Ubuntu Linux right now, because it was wiped from the second test machine to test current. But we have a few productive X4100M2 running Debian and there it looks like this:
> 
> ----
> # uname -a
> Linux XX 2.6.26-2-amd64 #1 SMP Tue Mar 9 22:29:32 UTC 2010 x86_64 GNU/Linux
> # cat /proc/interrupts 
>            CPU0       CPU1       CPU2       CPU3       
>   0:         36          0          0          1   IO-APIC-edge      timer
>   1:          0          0          0          2   IO-APIC-edge      i8042
>   7:          1          0          0          0   IO-APIC-edge    
>   8:          0          0          0          1   IO-APIC-edge      rtc0
>   9:          0          0          0          0   IO-APIC-fasteoi   acpi
>  12:          0          0          0          4   IO-APIC-edge      i8042
>  14:          0          0          0         74   IO-APIC-edge      ide0
>  21:          0          0          0          2   IO-APIC-fasteoi   ehci_hcd:usb2
>  22:          0          0          1         31   IO-APIC-fasteoi   ohci_hcd:usb1
>  56:      52836  302759221        129      50868   IO-APIC-fasteoi   eth2
>  57:     288921 1070387307        225      98210   IO-APIC-fasteoi   eth3
> 1271:      92146   45282139          9       4885   PCI-MSI-edge      ioc0
> NMI:          0          0          0          0   Non-maskable interrupts
> LOC:  258132347  312890202  166484456  147070084   Local timer interrupts
> RES:  118623017   84540907  100591028  107693244   Rescheduling interrupts
> CAL:     108384      89281     110429     104206   function call interrupts
> TLB:   14719843   24105630   12456528   18955140   TLB shootdowns
> TRM:          0          0          0          0   Thermal event interrupts
> THR:          0          0          0          0   Threshold APIC interrupts
> SPU:          0          0          0          0   Spurious interrupts
> ERR:          1
> ----
> 
> Not sure how to interpret this. At first sight no IRQ58, but I guess they might be using MSI for mpt, which might avoid the problem entirely.

Yes, the FreeBSD mpt(4) driver should also use MSI by default unless you have
disabled it for some reason.  Also, Linux will dynamically reshuffle IRQs
among CPUs based on load, so the I/O APIC/MSI -> CPU routing is more dynamic
in that case.
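
To see where a given interrupt actually lands in a /proc/interrupts dump like
the one quoted above, a rough Python sketch along these lines may help. It is
only an illustration: the function names parse_interrupts and busiest_cpu are
made up here, and SAMPLE is just a few rows copied from the quoted output. It
assumes the usual layout of one header row of CPU names followed by one row
per interrupt.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {irq_label: [count per CPU]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        counts = []
        for field in rest.split()[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        if len(counts) == ncpu:           # skip short rows such as "ERR: 1"
            table[label.strip()] = counts
    return table


def busiest_cpu(counts):
    """Return (cpu_index, share) for the CPU that handled most of an IRQ."""
    total = sum(counts)
    if total == 0:
        return None
    cpu = max(range(len(counts)), key=counts.__getitem__)
    return cpu, counts[cpu] / total


# A few rows copied from the Debian output quoted above.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 56:      52836  302759221        129      50868   IO-APIC-fasteoi   eth2
 57:     288921 1070387307        225      98210   IO-APIC-fasteoi   eth3
1271:      92146   45282139          9       4885   PCI-MSI-edge      ioc0
ERR:          1
"""

if __name__ == "__main__":
    for irq, counts in parse_interrupts(SAMPLE).items():
        print(irq, busiest_cpu(counts))
```

On the sample rows this reports CPU1 as the dominant target for eth2, eth3
and ioc0. Since the counters are cumulative and Linux may move IRQs around at
runtime, a single snapshot shows where interrupts have mostly gone, not a
fixed route.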

-- 
John Baldwin

