Sandy Bridge and MCA UNCOR PCC (problem + solution)

Stefan Esser se at freebsd.org
Wed Nov 30 10:56:50 UTC 2011


On 25.11.2011 21:13, Thomas Zander wrote:
> List,
> 
> here's a rant about a recent problem I had and the surprising
> solution.
> 
> I recently had to investigate weird unexpected issues on a workstation.
> Relevant hardware: Asus P8B-WS, Xeon E3-1260L (Sandy Bridge, Intel
> HD-2000 graphics)
> 
> Since we don't have kms and friends in STABLE yet, and I can live
> without accelerated video for now, I am using the vesa driver on this
> machine.
> 
> Initially, this had two major drawbacks:
> 1) 1280x1024 resolution utterly sucks on a 1680x1050 screen.
> 2) Reproducible unhandled MCA events (and subsequent kernel panics)
> like the following whenever I switch from X to console:
> 
> panic: machine check trap
> ...
> MCA: CPU 6 UNCOR UNCOR UNCOR PCC PCC PCC internal error 2internal error
> 2PCC internal error 2
> 
> The kernel dump _always_ showed something like:
> 
> current process         = 11 (idle: cpu3)
> trap number             = 28
> #1 0xffffffff805db167 at panic+0x187
> #2 0xffffffff808c6820 at trap_fatal+0x290
> #3 0xffffffff808c6d3a at trap+0x10a
> #4 0xffffffff808ae894 at calltrap+0x8
> #5 0xffffffff801f6b9a at acpi_cpu_idle+0x20a
> #6 0xffffffff806003af at sched_idletd+0x11f
> #7 0xffffffff805afe6f at fork_exit+0x11f
> #8 0xffffffff808aedde at fork_trampoline+0xe
> 
> mcelog did not help decoding the MCA output and the "internal error 2"
> message made me suspect that this CPU was maybe just broken.
> However, since I was utterly unable to produce the slightest other
> problem with this machine (under constantly heavy CPU + IO load), or any
> problem under other operating systems, I arrived at the wild speculation
> that there might be something in the Sandy Bridge silicon which this
> exact sequence of actions on FreeBSD could reliably trigger.
> 
> Long story short: I got the latest BIOS from Asus for this board. The
> changelog of course said absolutely nothing about fixing any known
> problem.
> Upon boot I entered the BIOS settings and noticed that it apparently
> contained a microcode update. A changelog for the microcode from Intel
> is of course nonexistent.
> 
> And since this boot there has not been a single problem with this
> machine. Vesa now works in 1680x1050 and switching from X to console
> and back does not trigger MCA events anymore.
> 
> I like to believe that for the first time a microcode update from Intel
> fixed my specific problem.
> 
> Anyway, now the story is on the list and for Google to find, in case
> anyone else has this problem as well.
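
A side note on the quoted MCA line, for anyone who finds this thread by
searching: "UNCOR" and "PCC" correspond to the UC (uncorrected error) and
PCC (processor context corrupt) bits of the machine check status register.
The following is only a minimal sketch of how such flags can be pulled out
of a raw status word; the bit positions are those documented in the Intel
SDM, while the function name and the example value are made up for
illustration and are not taken from the dump above.

```python
# Bit positions in IA32_MCi_STATUS as documented in the Intel SDM.
MC_STATUS_VAL = 1 << 63  # register contains a valid error
MC_STATUS_UC = 1 << 61   # error was not corrected by hardware
MC_STATUS_PCC = 1 << 57  # processor context may be corrupt


def mca_flags(status):
    """Return the flag names set in a raw machine check status word.

    Hypothetical helper for illustration; it only mimics the flag
    names seen in the panic message.
    """
    flags = []
    if not status & MC_STATUS_VAL:
        return flags  # nothing valid has been logged
    flags.append("UNCOR" if status & MC_STATUS_UC else "COR")
    if status & MC_STATUS_PCC:
        flags.append("PCC")
    return flags


# Hypothetical status word: valid, uncorrected, context corrupt.
example = MC_STATUS_VAL | MC_STATUS_UC | MC_STATUS_PCC
print(" ".join(mca_flags(example)))
```

With PCC set, the CPU state cannot be trusted any more, so a panic is the
expected reaction.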

Thank you for reporting this. I had somewhat similar issues with an
i7-2600K (not overclocked) on an ASUS P8H67-M EVO, and they have also
been resolved by a BIOS upgrade (to version 2001 for that mainboard).

The system locked up hard (did not even respond to pressing the reset
button) after attempting to switch from X11 to a text console, or even
on shutdown from X11. This started a few months ago (it had been
working when the system was new), but due to lack of debug access and
no way to obtain a core dump, I had just given up on starting a local
X11 server.

Meanwhile I had updated the BIOS to the latest version while trying
to resolve another issue, and after reading your message I retried
starting the X11/vesa server and found that it no longer completely
locks up the system on exit. Text consoles are still unusable once the
X server has been started (no stable signal; the monitor loses sync,
switches off and on again after a few seconds, and then only shows
hardly readable characters).

Anyway, I can now use the local X11 and still cleanly shut down the
system without having to cut the power.

The new version of the microcode patches appears to be "1a", but I
have no idea what version the previous BIOS contained.
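
For comparing microcode versions before and after a BIOS update: the
revision the CPU is currently running is reported in the upper 32 bits of
the IA32_BIOS_SIGN_ID MSR (0x8B). On FreeBSD the MSR should be readable
with cpucontrol(8) once the cpuctl module is loaded. The sketch below only
shows how to extract the revision from a raw 64-bit value; the function
name and the example value are assumptions for illustration, chosen so
that the result matches the "1a" mentioned above.

```python
IA32_BIOS_SIGN_ID = 0x8B  # MSR number, per the Intel SDM


def microcode_revision(msr_value):
    """Return the microcode revision held in bits 63:32 of the MSR value.

    Hypothetical helper for illustration only.
    """
    return (msr_value >> 32) & 0xFFFFFFFF


# Hypothetical raw MSR value, not read from a real machine.
example = 0x0000001A00000000
print(f"microcode revision: {microcode_revision(example):#x}")
```

Noting this value before flashing a new BIOS makes it possible to tell
afterwards whether the microcode actually changed.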

Regards, STefan


More information about the freebsd-stable mailing list