Re: MCE: Does this look possibly like a slot issue?

From: Larry Rosenman <ler_at_lerctr.org>
Date: Tue, 21 Jun 2022 16:13:28 UTC
Looks like it might be just that, Rodney:
root@freenas[~]# mcelog
Hardware event. This is not a software error.
MCE 0
CPU 14 BANK 8 TSC 525efc019bb6
MISC ac29890200040083 ADDR ee2f6e800
TIME 1655827944 Tue Jun 21 11:12:24 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 83
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 1
CPU 14 BANK 8 TSC 52a513d27f2c
MISC ac29890200041083 ADDR ee2f6e800
TIME 1655827944 Tue Jun 21 11:12:24 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 83
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 2
CPU 14 BANK 8 TSC 53d8cf2ceb4a
MISC ac29890200040582 ADDR ee2f6e800
TIME 1655827944 Tue Jun 21 11:12:24 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 82
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 3
CPU 14 BANK 8 TSC 5e4dae622cb6
MISC ac29890200041181 ADDR ee2f6e800
TIME 1655827944 Tue Jun 21 11:12:24 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 4
CPU 14 BANK 8 TSC 5eea68fdad4e
MISC ac29890200041784 ADDR ee2f6e800
TIME 1655827944 Tue Jun 21 11:12:24 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 84
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 5
CPU 14 BANK 8 TSC 5eea6e0bbce0
MISC ac29890200044000 ADDR ee2f6e800
TIME 1655827944 Tue Jun 21 11:12:24 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 0
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 6
CPU 12 BANK 8 TSC 5f6cbe9ef2bc
MISC ac29890200041181 ADDR ee2f6e800
TIME 1655827944 Tue Jun 21 11:12:24 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 20 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 7
CPU 14 BANK 8 TSC 64ba63c66e52
MISC ac29890200041181 ADDR ee2f6e800
TIME 1655827944 Tue Jun 21 11:12:24 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 8
CPU 14 BANK 8 TSC 659878c17622
MISC ac29890200040282 ADDR ee2f6e800
TIME 1655827944 Tue Jun 21 11:12:24 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 82
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 9
CPU 14 BANK 8 TSC 66b71c1dccf6
MISC ac29890200040183 ADDR ee2f6e800
TIME 1655827944 Tue Jun 21 11:12:24 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 83
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 10
CPU 14 BANK 8 TSC 6be0988610ce
MISC ac29890200040682 ADDR ee2f6e800
TIME 1655827944 Tue Jun 21 11:12:24 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 82
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
Hardware event. This is not a software error.
MCE 11
CPU 14 BANK 8 TSC 6be0995926f8
MISC ac29890200044000 ADDR ee2f6e800
TIME 1655827944 Tue Jun 21 11:12:24 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 0
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
root@freenas[~]# mcelog --dmi
Hardware event. This is not a software error.
MCE 0
CPU 14 BANK 8 TSC 525efc019bb6
MISC ac29890200040083 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 83
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
WARNING: SMBIOS data is often unreliable. Take with a grain of salt!
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 1
CPU 14 BANK 8 TSC 52a513d27f2c
MISC ac29890200041083 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 83
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 2
CPU 14 BANK 8 TSC 53d8cf2ceb4a
MISC ac29890200040582 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 82
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 3
CPU 14 BANK 8 TSC 5e4dae622cb6
MISC ac29890200041181 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 4
CPU 14 BANK 8 TSC 5eea68fdad4e
MISC ac29890200041784 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 84
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 5
CPU 14 BANK 8 TSC 5eea6e0bbce0
MISC ac29890200044000 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 0
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 6
CPU 12 BANK 8 TSC 5f6cbe9ef2bc
MISC ac29890200041181 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 20 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 7
CPU 14 BANK 8 TSC 64ba63c66e52
MISC ac29890200041181 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 81
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 8
CPU 14 BANK 8 TSC 659878c17622
MISC ac29890200040282 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 82
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 9
CPU 14 BANK 8 TSC 66b71c1dccf6
MISC ac29890200040183 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 83
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 10
CPU 14 BANK 8 TSC 6be0988610ce
MISC ac29890200040682 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 82
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
Hardware event. This is not a software error.
MCE 11
CPU 14 BANK 8 TSC 6be0995926f8
MISC ac29890200044000 ADDR ee2f6e800
TIME 1655827951 Tue Jun 21 11:12:31 2022
MCG status:
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 0
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: ac298902
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 22 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44 Step 2
DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
Device Locator: P2-DIMM2C
Bank Locator: BANK14
Manufacturer: Nanya
Serial Number: 642264CD
Asset Tag:
Part Number: NT4GC72B4NA1NL-BE
root@freenas[~]#
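
For anyone reading along, the STATUS word in the records above can be decoded by hand. What follows is a rough Python sketch, not anything mcelog ships: the function name decode_mci_status and the dictionary keys are made up for illustration, and the bit positions are the architectural IA32_MCi_STATUS layout as I understand it from the Intel SDM machine-check chapter, so check them against the SDM before relying on them.

```python
# Sketch only: bit layout assumed from the Intel SDM description of
# IA32_MCi_STATUS. All names here are illustrative, not mcelog's.

# MMM field of a memory controller error code (0000 0000 1MMM CCCC).
MEMORY_TRANSACTIONS = {
    0: "generic",
    1: "read",
    2: "write",
    3: "address/command",
    4: "scrub",
}


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into its architectural fields."""

    def bit(n):
        return bool((status >> n) & 1)

    mca_code = status & 0xFFFF  # bits 15:0
    fields = {
        "valid": bit(63),            # VAL
        "overflow": bit(62),         # OVER
        "uncorrected": bit(61),      # UC (clear means corrected)
        "enabled": bit(60),          # EN
        "misc_valid": bit(59),       # MISCV
        "addr_valid": bit(58),       # ADDRV
        "context_corrupt": bit(57),  # PCC
        # Bits 52:38; only meaningful when MCG_CAP reports CMCI support.
        "corrected_count": (status >> 38) & 0x7FFF,
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        "mca_code": mca_code,
    }
    # Memory controller errors: 000F 0000 1MMM CCCC (F is the filter bit).
    if (mca_code & 0xEF80) == 0x0080:
        mmm = (mca_code >> 4) & 0x7
        cccc = mca_code & 0xF
        fields["memory_transaction"] = MEMORY_TRANSACTIONS.get(mmm, "reserved")
        # CCCC == 1111 means the channel is not specified in this field.
        fields["channel"] = None if cccc == 0xF else cccc
    return fields


if __name__ == "__main__":
    for value in (0x8C0000400001009F, 0x88000040000200CF):
        print(f"{value:016x}", decode_mci_status(value))
```

Fed STATUS 8c0000400001009f, this reports a valid, corrected (UC clear) memory read error with a corrected count of 1, which agrees with what mcelog prints. The other value, 88000040000200cf, comes from the one scrub record in the quoted report below and decodes to a scrub error with ADDRV clear, which fits that record having no ADDR.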

On 06/21/2022 11:06 am, Rodney W. Grimes wrote:
>> 
>> 
>> Swapped 2 DIMMs, now we wait for the ZFS ARC to fill and start using all
>> the memory.
> 
> Depending on the results of that, one thing that is often overlooked
> when trying to troubleshoot memory systems in modern Intel systems
> is the fact that the DIMM now talks directly to the CPU chip that
> has the memory controller built into it.  THUS these "slot"-related
> ECC/Parity/blowup errors can actually be the CPU and/or the CPU
> socket and/or the seating of the CPU in the socket.
> 
> So if the error sticks with the DIMM slot and not the DIMM
> module, the next thing I would try would be a CPU chip reseat,
> including a good inspection of the socket for a damaged
> pin.  Also look at the lands on the CPU chip itself, and you
> can even try swapping CPU chips to see if it follows the
> CPU or the socket, much as you do with a DIMM.
> 
> 
>> 
>> On 06/20/2022 7:59 pm, Larry Rosenman wrote:
>> 
>> > SuperMicro X8DTN+
>> >
>> > 2 Processors, 6-core/12-Thread. CPU: Intel(R) Xeon(R) CPU
>> > E5645  @ 2.40GHz (2400.20-MHz K8-class CPU)
>> >
>> > I'll bring it down and swap DIMMS around
>> >
>> > On 06/20/2022 7:57 pm, Ultima wrote:
>> >
>> > Hey Larry,
>> >
>> > One red flag I am seeing is that the error is being produced on
>> > the same CPU/bank with each error you have provided so far.
>> >
>> > Can you try and follow my original recommendation and swap
>> > the DIMM in the problem slot with one from another populated
>> > slot and see if anything changes?
>> >
>> > Can you also provide the motherboard model? Also, do you
>> > have multiple CPUs installed in this system?
>> >
>> > Best regards,
>> > Richard Gallamore
>> >
>> > On Mon, Jun 20, 2022 at 5:41 PM Larry Rosenman <ler@lerctr.org> wrote:
>> >
>> > Yes and Yes.
>> >
>> > On 06/20/2022 7:37 pm, Ultima wrote:
>> >
>> > Are you sure that the module you replaced it with was good?
>> > Are you sure you replaced the correct module?
>> >
>> > Best regards,
>> > Richard Gallamore
>> >
>> > On Mon, Jun 20, 2022 at 5:23 PM Larry Rosenman <ler@lerctr.org> wrote:
>> >
>> > I'm seeing them constantly:
>> >
>> > root@freenas[~]# mcelog --dmi
>> > Hardware event. This is not a software error.
>> > MCE 0
>> > CPU 22 BANK 8 TSC 20aab486464a
>> > MISC ac29890200046444 ADDR ee2f6e800
>> > TIME 1655770989 Mon Jun 20 19:23:09 2022
>> > MCG status:
>> > Memory read ECC error
>> > Memory corrected error count (CORE_ERR_CNT): 1
>> > Memory transaction Tracker ID (RTId): 44
>> > Memory DIMM ID of error: 0
>> > Memory channel ID of error: 1
>> > Memory ECC syndrome: ac298902
>> > STATUS 8c0000400001009f MCGSTATUS 0
>> > MCGCAP 1c09 APICID 34 SOCKETID 0
>> > CPUID Vendor Intel Family 6 Model 44 Step 2
>> > WARNING: SMBIOS data is often unreliable. Take with a grain of salt!
>> > DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
>> > Device Locator: P2-DIMM2C
>> > Bank Locator: BANK14
>> > Manufacturer: Hyundai
>> > Serial Number: 40F3C20F
>> > Asset Tag:
>> > Part Number: HMT151R7BFR4C-H9
>> > Hardware event. This is not a software error.
>> > MCE 1
>> > CPU 22 BANK 8 TSC 296dfcc82582
>> > MISC ac29890200041381 ADDR ee2f6e800
>> > TIME 1655770989 Mon Jun 20 19:23:09 2022
>> > MCG status:
>> > Memory read ECC error
>> > Memory corrected error count (CORE_ERR_CNT): 1
>> > Memory transaction Tracker ID (RTId): 81
>> > Memory DIMM ID of error: 0
>> > Memory channel ID of error: 1
>> > Memory ECC syndrome: ac298902
>> > STATUS 8c0000400001009f MCGSTATUS 0
>> > MCGCAP 1c09 APICID 34 SOCKETID 0
>> > CPUID Vendor Intel Family 6 Model 44 Step 2
>> > DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
>> > Device Locator: P2-DIMM2C
>> > Bank Locator: BANK14
>> > Manufacturer: Hyundai
>> > Serial Number: 40F3C20F
>> > Asset Tag:
>> > Part Number: HMT151R7BFR4C-H9
>> > Hardware event. This is not a software error.
>> > MCE 2
>> > CPU 22 BANK 8 TSC 2a5604a6a070
>> > MISC ac29890200044281
>> > TIME 1655770989 Mon Jun 20 19:23:09 2022
>> > MCG status:
>> > Memory ECC error occurred during scrub
>> > Memory corrected error count (CORE_ERR_CNT): 1
>> > Memory transaction Tracker ID (RTId): 81
>> > Memory DIMM ID of error: 0
>> > Memory channel ID of error: 1
>> > Memory ECC syndrome: ac298902
>> > STATUS 88000040000200cf MCGSTATUS 0
>> > MCGCAP 1c09 APICID 34 SOCKETID 0
>> > CPUID Vendor Intel Family 6 Model 44 Step 2
>> > Hardware event. This is not a software error.
>> > MCE 3
>> > CPU 22 BANK 8 TSC 31e141418eb8
>> > MISC ac29890200046a4a ADDR ee2f6e800
>> > TIME 1655770989 Mon Jun 20 19:23:09 2022
>> > MCG status:
>> > Memory read ECC error
>> > Memory corrected error count (CORE_ERR_CNT): 1
>> > Memory transaction Tracker ID (RTId): 4a
>> > Memory DIMM ID of error: 0
>> > Memory channel ID of error: 1
>> > Memory ECC syndrome: ac298902
>> > STATUS 8c0000400001009f MCGSTATUS 0
>> > MCGCAP 1c09 APICID 34 SOCKETID 0
>> > CPUID Vendor Intel Family 6 Model 44 Step 2
>> > DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
>> > Device Locator: P2-DIMM2C
>> > Bank Locator: BANK14
>> > Manufacturer: Hyundai
>> > Serial Number: 40F3C20F
>> > Asset Tag:
>> > Part Number: HMT151R7BFR4C-H9
>> > Hardware event. This is not a software error.
>> > MCE 4
>> > CPU 22 BANK 8 TSC 3a014afee106
>> > MISC ac29890200046646 ADDR ee2f6e800
>> > TIME 1655770989 Mon Jun 20 19:23:09 2022
>> > MCG status:
>> > Memory read ECC error
>> > Memory corrected error count (CORE_ERR_CNT): 1
>> > Memory transaction Tracker ID (RTId): 46
>> > Memory DIMM ID of error: 0
>> > Memory channel ID of error: 1
>> > Memory ECC syndrome: ac298902
>> > STATUS 8c0000400001009f MCGSTATUS 0
>> > MCGCAP 1c09 APICID 34 SOCKETID 0
>> > CPUID Vendor Intel Family 6 Model 44 Step 2
>> > DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
>> > Device Locator: P2-DIMM2C
>> > Bank Locator: BANK14
>> > Manufacturer: Hyundai
>> > Serial Number: 40F3C20F
>> > Asset Tag:
>> > Part Number: HMT151R7BFR4C-H9
>> > Hardware event. This is not a software error.
>> > MCE 5
>> > CPU 22 BANK 8 TSC 41d1dbef1a6a
>> > MISC ac29890200046141 ADDR ee2f6e800
>> > TIME 1655770989 Mon Jun 20 19:23:09 2022
>> > MCG status:
>> > Memory read ECC error
>> > Memory corrected error count (CORE_ERR_CNT): 1
>> > Memory transaction Tracker ID (RTId): 41
>> > Memory DIMM ID of error: 0
>> > Memory channel ID of error: 1
>> > Memory ECC syndrome: ac298902
>> > STATUS 8c0000400001009f MCGSTATUS 0
>> > MCGCAP 1c09 APICID 34 SOCKETID 0
>> > CPUID Vendor Intel Family 6 Model 44 Step 2
>> > DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
>> > Device Locator: P2-DIMM2C
>> > Bank Locator: BANK14
>> > Manufacturer: Hyundai
>> > Serial Number: 40F3C20F
>> > Asset Tag:
>> > Part Number: HMT151R7BFR4C-H9
>> > Hardware event. This is not a software error.
>> > MCE 6
>> > CPU 22 BANK 8 TSC 4a1b1ecef446
>> > MISC ac29890200046a4a ADDR ee2f6e800
>> > TIME 1655770989 Mon Jun 20 19:23:09 2022
>> > MCG status:
>> > Memory read ECC error
>> > Memory corrected error count (CORE_ERR_CNT): 1
>> > Memory transaction Tracker ID (RTId): 4a
>> > Memory DIMM ID of error: 0
>> > Memory channel ID of error: 1
>> > Memory ECC syndrome: ac298902
>> > STATUS 8c0000400001009f MCGSTATUS 0
>> > MCGCAP 1c09 APICID 34 SOCKETID 0
>> > CPUID Vendor Intel Family 6 Model 44 Step 2
>> > DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
>> > Device Locator: P2-DIMM2C
>> > Bank Locator: BANK14
>> > Manufacturer: Hyundai
>> > Serial Number: 40F3C20F
>> > Asset Tag:
>> > Part Number: HMT151R7BFR4C-H9
>> > Hardware event. This is not a software error.
>> > MCE 7
>> > CPU 22 BANK 8 TSC 527bc27db776
>> > MISC ac29890200040386 ADDR ee2f6e800
>> > TIME 1655770989 Mon Jun 20 19:23:09 2022
>> > MCG status:
>> > Memory read ECC error
>> > Memory corrected error count (CORE_ERR_CNT): 1
>> > Memory transaction Tracker ID (RTId): 86
>> > Memory DIMM ID of error: 0
>> > Memory channel ID of error: 1
>> > Memory ECC syndrome: ac298902
>> > STATUS 8c0000400001009f MCGSTATUS 0
>> > MCGCAP 1c09 APICID 34 SOCKETID 0
>> > CPUID Vendor Intel Family 6 Model 44 Step 2
>> > DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
>> > Device Locator: P2-DIMM2C
>> > Bank Locator: BANK14
>> > Manufacturer: Hyundai
>> > Serial Number: 40F3C20F
>> > Asset Tag:
>> > Part Number: HMT151R7BFR4C-H9
>> > Hardware event. This is not a software error.
>> > MCE 8
>> > CPU 22 BANK 8 TSC 5aa4ecdd795a
>> > MISC ac29890200046646 ADDR ee2f6e800
>> > TIME 1655770989 Mon Jun 20 19:23:09 2022
>> > MCG status:
>> > Memory read ECC error
>> > Memory corrected error count (CORE_ERR_CNT): 1
>> > Memory transaction Tracker ID (RTId): 46
>> > Memory DIMM ID of error: 0
>> > Memory channel ID of error: 1
>> > Memory ECC syndrome: ac298902
>> > STATUS 8c0000400001009f MCGSTATUS 0
>> > MCGCAP 1c09 APICID 34 SOCKETID 0
>> > CPUID Vendor Intel Family 6 Model 44 Step 2
>> > DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
>> > Device Locator: P2-DIMM2C
>> > Bank Locator: BANK14
>> > Manufacturer: Hyundai
>> > Serial Number: 40F3C20F
>> > Asset Tag:
>> > Part Number: HMT151R7BFR4C-H9
>> > root@freenas[~]#
>> >
>> > and I replaced the DIMM yesterday :(
>> >
>> > On 06/20/2022 7:19 pm, Ultima wrote:
>> >
>> > Hey Larry,
>> >
>> > It is possible it's the motherboard itself, but it's rare. The way I
>> > would determine this is to swap the DIMM module with another
>> > populated slot on the motherboard and see if the error migrated
>> > to the new slot or not. Also, this error doesn't necessarily mean
>> > there is a problem that needs to be addressed. If you have been
>> > running the system for many months and you see ECC errors a
>> > handful of times, it can probably be safely ignored.
>> >
>> > Best regards,
>> > Richard Gallamore
>> >
>> > On Mon, Jun 20, 2022 at 3:14 PM Larry Rosenman <ler@lerctr.org> wrote:
>> > I've gotten a BUNCH of these on my TrueNAS server.  I've replaced this
>> > DIMM a couple of times, and still the MCEs continue.
>> > Is it possible it's a motherboard slot issue?
>> >
>> > Hardware event. This is not a software error.
>> > MCE 8
>> > CPU 22 BANK 8 TSC 5aa4ecdd795a
>> > MISC ac29890200046646 ADDR ee2f6e800
>> > TIME 1655762472 Mon Jun 20 17:01:12 2022
>> > MCG status:
>> > Memory read ECC error
>> > Memory corrected error count (CORE_ERR_CNT): 1
>> > Memory transaction Tracker ID (RTId): 46
>> > Memory DIMM ID of error: 0
>> > Memory channel ID of error: 1
>> > Memory ECC syndrome: ac298902
>> > STATUS 8c0000400001009f MCGSTATUS 0
>> > MCGCAP 1c09 APICID 34 SOCKETID 0
>> > CPUID Vendor Intel Family 6 Model 44 Step 2
>> > DDR3 DIMM 800 Mhz Other Width 72 Data Width 64 Size 4 GB
>> > Device Locator: P2-DIMM2C
>> > Bank Locator: BANK14
>> > Manufacturer: Hyundai
>> > Serial Number: 40F3C20F
>> > Asset Tag:
>> > Part Number: HMT151R7BFR4C-H9
>> >
>> > --
>> > Larry Rosenman                     http://www.lerctr.org/~ler
>> > Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
>> > US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106

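To compare the two days of output without eyeballing every record, something like the following Python sketch could tally the records by slot, module serial and address. It is only an illustration: parse_records and the field names are hypothetical helpers, it assumes the text layout that mcelog --dmi printed in this thread, and the sample holds just two abbreviated records copied from above (one from the Jun 20 report, one from Jun 21).

```python
import re
from collections import Counter

# Labels in the mcelog --dmi text that we keep, mapped to short names.
WANTED = {
    "Device Locator": "slot",
    "Serial Number": "serial",
    "Memory ECC syndrome": "syndrome",
}


def parse_records(text):
    """Return one dict per 'Hardware event.' record found in the text."""
    records, rec = [], None
    for raw in text.splitlines():
        # Drop mail quoting such as '>> > ' so quoted output parses too.
        line = raw.lstrip("> \t").rstrip()
        if line.startswith("Hardware event."):
            rec = {}
            records.append(rec)
        elif rec is not None:
            match = re.search(r"\bADDR ([0-9a-f]+)", line)
            if match:
                rec["addr"] = match.group(1)
            elif ":" in line:
                key, _, value = line.partition(":")
                if key in WANTED:
                    rec[WANTED[key]] = value.strip()
    return records


# Two abbreviated records taken from this thread.
SAMPLE = """\
>> > Hardware event. This is not a software error.
>> > MISC ac29890200046444 ADDR ee2f6e800
>> > Memory ECC syndrome: ac298902
>> > Device Locator: P2-DIMM2C
>> > Serial Number: 40F3C20F
Hardware event. This is not a software error.
MISC ac29890200040083 ADDR ee2f6e800
Memory ECC syndrome: ac298902
Device Locator: P2-DIMM2C
Serial Number: 642264CD
"""

if __name__ == "__main__":
    records = parse_records(SAMPLE)
    tally = Counter(
        (r.get("slot"), r.get("serial"), r.get("addr")) for r in records
    )
    for (slot, serial, addr), count in tally.items():
        print(f"{count} error(s): slot={slot} serial={serial} addr={addr}")
```

On the sample, the slot (P2-DIMM2C), the address and the syndrome stay the same while the serial number changes, which is the pattern you would expect if the errors follow the slot rather than the module.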
-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Dr, Round Rock, TX 78665-2106