Sandy Bridge and MCA UNCOR PCC (problem + solution)

Thomas Zander thomas.e.zander at googlemail.com
Fri Nov 25 20:38:17 UTC 2011


List,

here's a rant about a recent problem I had and the surprising
solution.

I recently had to investigate weird unexpected issues on a workstation.
Relevant hardware: Asus P8B-WS, Xeon E3-1260L (Sandy Bridge, Intel
HD-2000 graphics)

Since we don't have kms and friends in STABLE yet, and I can live
without accelerated video for now, I am using the vesa driver on this
machine.

Initially, this had two major drawbacks:
1) 1280x1024 resolution utterly sucks on a 1680x1050 screen.
2) Reproducible unhandled MCA events (and subsequent kernel panics)
like the following whenever I switch from X to console:

panic: machine check trap
...
MCA: CPU 6 UNCOR UNCOR UNCOR PCC PCC PCC internal error 2internal error
2PCC internal error 2

The kernel dump _always_ showed something like:

current process         = 11 (idle: cpu3)
trap number             = 28
#1 0xffffffff805db167 at panic+0x187
#2 0xffffffff808c6820 at trap_fatal+0x290
#3 0xffffffff808c6d3a at trap+0x10a
#4 0xffffffff808ae894 at calltrap+0x8
#5 0xffffffff801f6b9a at acpi_cpu_idle+0x20a
#6 0xffffffff806003af at sched_idletd+0x11f
#7 0xffffffff805afe6f at fork_exit+0x11f
#8 0xffffffff808aedde at fork_trampoline+0xe

mcelog did not help in decoding the MCA output, and the "internal
error 2" message made me suspect that this CPU was maybe just broken.
However, due to my utter inability to produce the slightest other
problem with this machine (constantly heavy CPU + IO load) or any
problem using other operating systems, I arrived at the wild
speculation that there might be something in the Sandy Bridge silicon
which this exact sequence of actions on FreeBSD could reliably trigger.
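
For anyone who wants to read such an MCA line by hand: the flags come
from the upper bits of the IA32_MCi_STATUS register, and the low 16
bits hold the MCA error code. Below is a minimal sketch in Python,
assuming the bit layout from the machine-check chapter of the Intel SDM
and assuming that "internal error N" corresponds to the SDM's "internal
unclassified" code pattern (0000 01xx xxxx xxxx). The status value in
the example is hypothetical, built only to resemble the output above;
the real register value was not in my logs. The function name is made
up for illustration.

```python
# Flag bits in IA32_MCi_STATUS (Intel SDM, machine-check architecture).
# "UNCOR" is the label seen in the kernel message; the SDM calls it UC.
MC_STATUS_FLAGS = [
    (63, "VAL"),    # register contents are valid
    (62, "OVER"),   # an earlier error was overwritten
    (61, "UNCOR"),  # uncorrected error
    (60, "EN"),     # error reporting was enabled
    (59, "MISCV"),  # MISC register holds valid data
    (58, "ADDRV"),  # ADDR register holds a valid address
    (57, "PCC"),    # processor context corrupt
]


def decode_mc_status(status):
    """Split a 64-bit status value into flags, error kind, model code."""
    flags = [name for bit, name in MC_STATUS_FLAGS if status & (1 << bit)]
    mca_error = status & 0xFFFF               # bits 15:0
    model_specific = (status >> 16) & 0xFFFF  # bits 31:16
    # Assumption: the "internal unclassified" pattern is what gets
    # reported as "internal error N".
    if (mca_error & 0xFC00) == 0x0400:
        kind = "internal error %x" % (mca_error & 0x03FF)
    else:
        kind = "other (code 0x%04x)" % mca_error
    return flags, kind, model_specific


# Hypothetical status value, NOT taken from the actual crash.
example = (1 << 63) | (1 << 61) | (1 << 60) | (1 << 57) | 0x0402
flags, kind, model_specific = decode_mc_status(example)
print(" ".join(flags), kind)
```

UNCOR together with PCC means the error was not corrected and the
processor state can no longer be trusted, which is why the kernel has
no choice but to panic.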

Long story short: I got the latest BIOS from Asus for this board. The
changelog of course said absolutely nothing about fixing any known
problem.
Upon boot I entered the BIOS settings and noticed that it apparently
contained a microcode update. The changelog for microcode from Intel is
of course nonexistent.

And since this boot there has not been a single problem with this
machine. VESA now works in 1680x1050 and switching from X to console
and back does not trigger MCA events anymore.

I like to believe that for the first time a microcode update from Intel
fixed my specific problem.

Anyway, now the story is on the list and for Google to find, in case
anyone else has this problem as well.

Best regards
Riggs

-- 
- Now the world has gone to bed  | Now I lay me down to sleep        -
-- Darkness won't engulf my head | Try to count electric sheep      --
--- I can see by infra-red       | Sweet dream wishes you can keep ---
---- How I hate the night        | How I hate the night           ----

