MCA messages in dmesg

John Baldwin jhb at freebsd.org
Thu Sep 30 17:25:20 UTC 2010


On Thursday, September 30, 2010 12:33:24 pm Adam Vande More wrote:
> On Thu, Sep 30, 2010 at 8:40 AM, John Baldwin <jhb at freebsd.org> wrote:
> 
> > On Thursday, September 30, 2010 2:49:24 am Adam Vande More wrote:
> > > For a while now, my home server has been acting up.  It had a bad set of
> > > RAM long ago, which I replaced, and it worked fine after that.  It's been
> > > weird again now, and I've found this in dmesg:
> > >
> > > MCA: Bank 0, Status 0xf200000000000800
> > > MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
> > > MCA: Vendor "GenuineIntel", ID 0x6fb, APIC ID 2
> > > MCA: CPU 2 UNCOR PCC OVER BUSL0 Source ERR Memory
> > > MCA: Bank 0, Status 0xf200000000000800
> > > MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
> > > MCA: Vendor "GenuineIntel", ID 0x6fb, APIC ID 3
> > > MCA: CPU 3 UNCOR PCC OVER BUSL0 Source ERR Memory
> >
> > Are you getting a panic when this happens?
> >
> 
> Its symptoms vary, but yes I think so.  The box is headless, so I depend on
> logs after boot to see what happens.  Sometimes the box panics and powers
> off with no warning, and other times it just seems to hit a stall state
> where everything becomes unresponsive and I have to manually power off.

Ok, it is a memory error of some sort, but mcelog claims it is a transaction
timeout rather than an ECC error, per se:

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 2 BANK 0 
MCG status:
MCi status:
Error overflow
Uncorrected error
Error enabled
Processor context corrupt
MCA: BUS Level-0 Local-CPU-originated-request Generic Memory-access Request-timeout Error
BQ_DCU_READ_TYPE BQ_ERR_HARD_TYPE BQ_ERR_HARD_TYPE
STATUS f200000000000800 MCGSTATUS 0
MCGCAP 806 APICID 2 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 15
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 3 BANK 0 
MCG status:
MCi status:
Error overflow
Uncorrected error
Error enabled
Processor context corrupt
MCA: BUS Level-0 Local-CPU-originated-request Generic Memory-access Request-timeout Error
BQ_DCU_READ_TYPE BQ_ERR_HARD_TYPE BQ_ERR_HARD_TYPE
STATUS f200000000000800 MCGSTATUS 0
MCGCAP 806 APICID 3 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 15

I've no idea which specific hardware is busted (memory, motherboard, or CPU),
but something is very likely broken.
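
For reference, the flag lines in the output above ("Error overflow",
"Uncorrected error", "Error enabled", "Processor context corrupt") can be
reproduced by hand from the status word.  Below is a minimal sketch in Python,
assuming the IA32_MCi_STATUS bit layout documented in the Intel SDM.  The
function and variable names are illustrative only and are not part of mcelog
or the FreeBSD kernel, and the sketch does not try to decode the sub-fields of
the 16-bit MCA error code the way mcelog does.

```python
# Flag bits in the upper part of IA32_MCi_STATUS (bit layout per the Intel SDM).
# The names follow the usual abbreviations; this table is an illustration,
# not code taken from mcelog or FreeBSD.
FLAGS = [
    (63, "VAL"),    # register contains a valid error
    (62, "OVER"),   # error overflow: an earlier error was overwritten
    (61, "UC"),     # uncorrected error
    (60, "EN"),     # error reporting was enabled
    (59, "MISCV"),  # MISC register holds valid data
    (58, "ADDRV"),  # ADDR register holds a valid address
    (57, "PCC"),    # processor context corrupt
]


def decode_mci_status(status):
    """Split a 64-bit bank status word into flag names and error codes."""
    flags = [name for bit, name in FLAGS if status & (1 << bit)]
    return {
        "flags": flags,
        "mca_error_code": status & 0xFFFF,              # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


# Status value reported for bank 0 on both CPUs in the dmesg output.
result = decode_mci_status(0xF200000000000800)
print(" ".join(result["flags"]))
print(hex(result["mca_error_code"]))
```

For the status word in this report, the set flags are VAL, OVER, UC, EN and
PCC, and the low 16 bits are 0x800.  MISCV and ADDRV are clear, so the bank
recorded no address that could point at a particular DIMM.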

-- 
John Baldwin


More information about the freebsd-stable mailing list