MCA messages after upgrade to 8.2-BETA1

John Baldwin jhb at freebsd.org
Wed Dec 22 14:59:13 UTC 2010


On Wednesday, December 22, 2010 7:41:25 am Miroslav Lachman wrote:
> Dec 21 12:42:26 kavkaz kernel: MCA: Bank 0, Status 0xd40e400000000833
> Dec 21 12:42:26 kavkaz kernel: MCA: Global Cap 0x0000000000000105, Status 0x0000000000000000
> Dec 21 12:42:26 kavkaz kernel: MCA: Vendor "AuthenticAMD", ID 0x40f33, APIC ID 0
> Dec 21 12:42:26 kavkaz kernel: MCA: CPU 0 COR OVER BUSLG Source DRD Memory
> Dec 21 12:42:26 kavkaz kernel: MCA: Address 0x236493c0

You are getting corrected ECC errors in your RAM.  You see them once an hour
because we poll the machine check registers once an hour.  If this keeps
happening, you might have a DIMM that is dying.
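
To see why the kernel labels this record "COR OVER", the status word can be
decoded by hand.  What follows is only a rough sketch, not the kernel's or
mcelog's code: the function name decode_status and the FLAG_BITS table are
made up for illustration.  Bits 57-63 are the architectural MCi_STATUS flags;
treating bits 45 and 46 as UECC/CECC is an AMD-specific assumption that matches
the "bit46 = corrected ecc error" line in the mcelog output.

```python
# Hypothetical helper: decode the flag bits of a machine check status word.
# Bits 63..57 are architectural; bits 46/45 (CECC/UECC) are assumed AMD-specific.
FLAG_BITS = {
    63: "VAL",    # record is valid
    62: "OVER",   # overflow: more errors occurred than could be logged
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting enabled
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds a valid address
    57: "PCC",    # processor context corrupt
    46: "CECC",   # corrected ECC error (AMD)
    45: "UECC",   # uncorrected ECC error (AMD)
}


def decode_status(status):
    """Return the set flags, a corrected/uncorrected verdict and the error code."""
    flags = [name for bit, name in sorted(FLAG_BITS.items(), reverse=True)
             if (status >> bit) & 1]
    return {
        "flags": flags,
        "corrected": "UC" not in flags,   # reported as COR when UC is clear
        "error_code": status & 0xFFFF,    # low 16 bits: MCA error code
    }


if __name__ == "__main__":
    info = decode_status(0xD40E400000000833)  # Bank 0 status from the log
    print(info["flags"], info["corrected"], hex(info["error_code"]))
```

For the Bank 0 value from the log, UC is clear while OVER and CECC are set,
which is consistent with a corrected ECC error that was seen more than once
between polls.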

% ~/mcelog --ascii < foo.txt 
mcelog: Cannot open /dev/mem for DMI decoding: Permission denied
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 0 data cache 
ADDR 236493c0 
  Data cache ECC error (syndrome 1c)
       bit46 = corrected ecc error
       bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
             data read mem transaction
             memory access, level generic'
STATUS d40e400000000833 MCGSTATUS 0
MCGCAP 105 APICID 0 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 1 instruction cache 
ADDR 2a1c9440 
  Instruction cache ECC error
       bit46 = corrected ecc error
       bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
             instruction fetch mem transaction
             memory access, level generic'
STATUS d400400000000853 MCGSTATUS 0
MCGCAP 105 APICID 0 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 2 bus unit 
  L2 cache ECC error
  Bus or cache array error
       bit46 = corrected ecc error
       bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
             prefetch mem transaction
             memory access, level generic'
STATUS d000400000000863 MCGSTATUS 0
MCGCAP 105 APICID 0 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 4 northbridge 
MISC e00d0fff00000000 ADDR 2cac9678 
  Northbridge RAM ECC error
  ECC syndrome = 1c
       bit33 = err cpu1
       bit46 = corrected ecc error
       bit59 = misc error valid
       bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
             generic read mem transaction
             memory access, level generic'
STATUS dc0e400200000813 MCGSTATUS 0
MCGCAP 105 APICID 0 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 1 0 data cache 
ADDR 23649640 
  Data cache ECC error (syndrome 1c)
       bit46 = corrected ecc error
       bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
             data read mem transaction
             memory access, level generic'
STATUS d40e400000000833 MCGSTATUS 0
MCGCAP 105 APICID 1 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 1 1 instruction cache 
ADDR 2a1c9440 
  Instruction cache ECC error
       bit46 = corrected ecc error
       bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
             instruction fetch mem transaction
             memory access, level generic'
STATUS d400400000000853 MCGSTATUS 0
MCGCAP 105 APICID 1 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 1 2 bus unit 
  L2 cache ECC error
  Bus or cache array error
       bit46 = corrected ecc error
       bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
             prefetch mem transaction
             memory access, level generic'
STATUS d000400000000863 MCGSTATUS 0
MCGCAP 105 APICID 1 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
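
If the log grows, a tally of records per CPU and bank makes it easier to tell
whether one unit keeps reporting.  This is a minimal sketch that assumes
record headers of the form "CPU <cpu> <bank> <unit name>", as in the output
above; the name tally_records is hypothetical, and other mcelog versions may
print a different layout.

```python
import re
from collections import Counter

# Matches header lines such as "CPU 0 4 northbridge".  "CPUID ..." lines do
# not match, because a space and a digit must follow "CPU".
RECORD_RE = re.compile(r"^CPU (\d+) (\d+) (.+?)[ \t]*$", re.MULTILINE)


def tally_records(text):
    """Count decoded records per (cpu, bank, unit name)."""
    counts = Counter()
    for cpu, bank, unit in RECORD_RE.findall(text):
        counts[(int(cpu), int(bank), unit)] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage (assumed): save the mcelog output to a file and pipe it in.
    for (cpu, bank, unit), n in sorted(tally_records(sys.stdin.read()).items()):
        print(f"CPU {cpu} bank {bank} ({unit}): {n}")
```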


-- 
John Baldwin


More information about the freebsd-stable mailing list