8.1-RC2 MCE caused by some LAPIC/clock changes?
Markus Gebert
markus.gebert at hostpoint.ch
Wed Jul 21 16:44:51 UTC 2010
On 21.07.2010, at 14:36, Andriy Gapon wrote:
> on 21/07/2010 15:25 Markus Gebert said the following:
>> On 21.07.2010, at 10:33, Andriy Gapon wrote:
>>
>>> on 21/07/2010 03:57 Markus Gebert said the following:
>>>> Another thing though: Today I compared verbose boot output from 8-stable
>>>> and the current box. I saw that the ioapic sets up IRQ routing differently
>>>> on these two systems although the hardware is the same. This seemed not so
>>>> interesting at first, but then I noticed that 8-stable sets up two routes
>>>> (to lapic0 and lapic2, or sometimes lapic3) for IRQ58 (mpt0), while current
>>>> only uses one route (to lapic0).
>>> My understanding is that it's not "two routes", but re-routing. During early
>>> boot all interrupts are bound to BSP; later, when APs become online, the
>>> interrupts are re-distributed among available CPUs.
>>
>> I guess you're right, misinterpretation on my side. Thanks for clarifying this.
>>
>>
>> Now being aware of this, it seems to me that in the machdep.lapic_allclocks=0
>> case, there might just be more interrupts to be assigned/routed due to "more
>> clocks being used". If that's true, maybe it's just "luck" that in this case
>> the mpt interrupt gets assigned to lapic0/cpu0 and the box runs fine. I'm just
>> guessing though, since I have no clue how interrupts are assigned to lapics
>> exactly (round-robin? some logic?).
>
> Yes, round-robin, for interrupts that are not explicitly bound to specific CPUs.
> The process is deterministic, but hard to predict indeed.
I see.
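To make sure I understand the consequence, here is a toy sketch (Python, not
the actual FreeBSD code; the function name and all IRQ numbers except 58 are
made up for illustration). It assumes plain round-robin over all CPUs and only
shows why the target CPU of a given IRQ can shift when more interrupts are
assigned before it:

```python
def assign_round_robin(irqs, ncpus, start=0):
    """Hand out IRQs to CPUs in turn and return {irq: cpu}."""
    return {irq: (start + i) % ncpus for i, irq in enumerate(irqs)}


# Hypothetical IRQ lists; only IRQ 58 (mpt0) comes from this thread.
few_irqs = [16, 17, 58]
more_irqs = [16, 17, 20, 58]

print(assign_round_robin(few_irqs, 4)[58])   # -> 2
print(assign_round_robin(more_irqs, 4)[58])  # -> 3
```

So the result is deterministic for a given set of interrupts, but one extra
interrupt earlier in the list is enough to move IRQ58 to another CPU.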
>>>> I used 'cpuset -c -l 0 -x 58' in an attempt to make my 8-stable box behave
>>>> like the one running current. Indeed, this seems to have changed IRQ58 to
>>>> be routed to lapic0 only. And the box was running for hours without showing
>>>> the symptoms.
>>>>
>>>> I just checked boot verbose output of my 8-stable box again (booted with
>>>> machdep.lapic_allclocks=0 as mentioned above). And now it seems to have set
>>>> up IRQ routes just like the current box (one route for IRQ58 to lapic0).
>>> Not sure how to interpret this properly. One possibility is a hardware
>>> problem where interrupt message route between ioapic2 and CPU to which lapic3
>>> belongs is flaky. Perhaps, this might be a FreeBSD problem: it could be that
>>> the system somehow tells us not to set up such routes, but we don't listen. But
>>> this is far fetched.
>>
>>
>> I'm not sure either. If my "theory" above proved to be true, it would have been
>> just luck, that 6.x and 7.x (and current) run just fine on the X4100M2. A
>> (short) test on Ubuntu didn't trigger the problem, so the Linux kernel is
>> either lucky too by selecting an interrupt route that is "not flaky", or
>> there's indeed some way to figure out not to use some lapics for some
>> interrupts. Or we didn't test Linux thoroughly enough.
>
> Yep, it would be interesting to see how interrupts were distributed among CPUs on
> that Linux.
Well, I can't provide this kind of information about _that_ Ubuntu Linux right
now, because it was wiped from the second test machine to test current. But we
have a few production X4100M2 boxes running Debian, and there it looks like this:
----
# uname -a
Linux XX 2.6.26-2-amd64 #1 SMP Tue Mar 9 22:29:32 UTC 2010 x86_64 GNU/Linux
# cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3
   0:         36          0          0          1  IO-APIC-edge     timer
   1:          0          0          0          2  IO-APIC-edge     i8042
   7:          1          0          0          0  IO-APIC-edge
   8:          0          0          0          1  IO-APIC-edge     rtc0
   9:          0          0          0          0  IO-APIC-fasteoi  acpi
  12:          0          0          0          4  IO-APIC-edge     i8042
  14:          0          0          0         74  IO-APIC-edge     ide0
  21:          0          0          0          2  IO-APIC-fasteoi  ehci_hcd:usb2
  22:          0          0          1         31  IO-APIC-fasteoi  ohci_hcd:usb1
  56:      52836  302759221        129      50868  IO-APIC-fasteoi  eth2
  57:     288921 1070387307        225      98210  IO-APIC-fasteoi  eth3
1271:      92146   45282139          9       4885  PCI-MSI-edge     ioc0
 NMI:          0          0          0          0  Non-maskable interrupts
 LOC:  258132347  312890202  166484456  147070084  Local timer interrupts
 RES:  118623017   84540907  100591028  107693244  Rescheduling interrupts
 CAL:     108384      89281     110429     104206  function call interrupts
 TLB:   14719843   24105630   12456528   18955140  TLB shootdowns
 TRM:          0          0          0          0  Thermal event interrupts
 THR:          0          0          0          0  Threshold APIC interrupts
 SPU:          0          0          0          0  Spurious interrupts
 ERR:          1
----
Not sure how to interpret this. At first sight no IRQ58, but I guess they might be using MSI for mpt, which might avoid the problem entirely.
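In case it helps when comparing such dumps between boxes, here is a rough
sketch of a parser (Python; the function names are my own, and the sample is
just three lines copied from the output above). It assumes the usual layout of
one header line with the CPU names followed by "label: counts... description"
lines, and reports which CPU took most of each interrupt:

```python
SAMPLE = """
CPU0 CPU1 CPU2 CPU3
56: 52836 302759221 129 50868 IO-APIC-fasteoi eth2
1271: 92146 45282139 9 4885 PCI-MSI-edge ioc0
ERR: 1
"""


def parse_interrupts(text):
    """Return {label: (per-CPU counts, description)} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())          # header line lists the CPUs
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:       # some rows (e.g. ERR) have fewer counts
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return table


def busiest_cpu(counts):
    """Index of the CPU that handled the most interrupts."""
    return max(range(len(counts)), key=counts.__getitem__)


table = parse_interrupts(SAMPLE)
for label in ("56", "1271"):
    counts, desc = table[label]
    print(label, desc, "-> CPU%d" % busiest_cpu(counts))
```

For the lines above this reports CPU1 for both eth2 and ioc0, and the ioc0
line is listed as PCI-MSI-edge, not as an IO-APIC interrupt.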
Markus