svn commit: r269052 - head/sys/x86/x86

Marius Strobl marius at FreeBSD.org
Thu Jul 24 10:14:52 UTC 2014


Author: marius
Date: Thu Jul 24 10:14:51 2014
New Revision: 269052
URL: http://svnweb.freebsd.org/changeset/base/269052

Log:
  Intel desktop Haswell CPUs may report benign corrected parity errors (see
  HSD131 erratum in [1]) at a considerable rate. So filter these by default
  unless logging is enabled. Unfortunately, there really is no better way to
  reasonably implement suppressing these errors than to just skip them
  in mca_log(). Given that they are reported for bank 0, they'd need to be
  masked in MSR_MC0_CTL. However, P6 family processors require that register
  to be set to either all 0s or all 1s, disabling way more than the one error
  in question when using all 0s there. Alternatively, it could be masked for
  the corresponding CMCI, but that still wouldn't keep the periodic scanner
  from detecting these spurious errors. Apart from that, register contents of
  MSR_MC0_CTL{,2} don't seem to be publicly documented, neither in the Intel
  Architectures Developer's Manual nor in the Haswell datasheets.
  
  Note that while HSD131 actually is only about C0-stepping as of revision
  014 of the Intel desktop 4th generation processor family specification
  update, these corrected errors also have been observed with D0-stepping
  aka "Haswell Refresh".
  
  1: http://www.intel.de/content/dam/www/public/us/en/documents/specification-updates/4th-gen-core-family-desktop-specification-update.pdf
  
  Reviewed by:	jhb
  MFC after:	3 days
  Sponsored by:	Bally Wulff Games & Entertainment GmbH

Modified:
  head/sys/x86/x86/mca.c

Modified: head/sys/x86/x86/mca.c
==============================================================================
--- head/sys/x86/x86/mca.c	Thu Jul 24 10:12:22 2014	(r269051)
+++ head/sys/x86/x86/mca.c	Thu Jul 24 10:14:51 2014	(r269052)
@@ -99,6 +99,10 @@ static int amd10h_L1TP = 1;
 SYSCTL_INT(_hw_mca, OID_AUTO, amd10h_L1TP, CTLFLAG_RDTUN, &amd10h_L1TP, 0,
     "Administrative toggle for logging of level one TLB parity (L1TP) errors");
 
+static int intel6h_HSD131;
+SYSCTL_INT(_hw_mca, OID_AUTO, intel6h_HSD131, CTLFLAG_RDTUN, &intel6h_HSD131, 0,
+    "Administrative toggle for logging of spurious corrected errors");
+
 int workaround_erratum383;
 SYSCTL_INT(_hw_mca, OID_AUTO, erratum383, CTLFLAG_RD, &workaround_erratum383, 0,
     "Is the workaround for Erratum 383 on AMD Family 10h processors enabled?");
@@ -242,12 +246,34 @@ mca_error_mmtype(uint16_t mca_error)
 	return ("???");
 }
 
+static int __nonnull(1)
+mca_mute(const struct mca_record *rec)
+{
+
+	/*
+	 * Skip spurious corrected parity errors generated by desktop Haswell
+	 * (see HSD131 erratum) unless reporting is enabled.
+	 * Note that these errors also have been observed with D0-stepping,
+	 * while the revision 014 desktop Haswell specification update only
+	 * talks about C0-stepping.
+	 */
+	if (rec->mr_cpu_vendor_id == CPU_VENDOR_INTEL &&
+	    rec->mr_cpu_id == 0x306c3 && rec->mr_bank == 0 &&
+	    rec->mr_status == 0x90000040000f0005 && !intel6h_HSD131)
+		return (1);
+
+	return (0);
+}
+
 /* Dump details about a single machine check. */
 static void __nonnull(1)
 mca_log(const struct mca_record *rec)
 {
 	uint16_t mca_error;
 
+	if (mca_mute(rec))
+		return;
+
 	printf("MCA: Bank %d, Status 0x%016llx\n", rec->mr_bank,
 	    (long long)rec->mr_status);
 	printf("MCA: Global Cap 0x%016llx, Status 0x%016llx\n",
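
The magic numbers matched in mca_mute() can be unpacked by hand. The following is a minimal userland sketch, not part of the commit. It assumes the IA32_MCi_STATUS and CPUID signature bit layouts as documented in the Intel SDM. The names demo_record, demo_mute, demo_logging_enabled, the cpuid_* helpers and the MC_STATUS_* macros are hypothetical stand-ins for illustration, not the kernel's own definitions.

```c
#include <stdint.h>

/* Assumed IA32_MCi_STATUS layout (Intel SDM, machine-check architecture). */
#define MC_STATUS_VAL		(1ULL << 63)	/* register contents valid */
#define MC_STATUS_UC		(1ULL << 61)	/* error was not corrected */
#define MC_STATUS_EN		(1ULL << 60)	/* reporting was enabled */
#define MC_STATUS_COR_COUNT(s)	(((s) >> 38) & 0x7fff)	/* bits 52:38 */
#define MC_STATUS_MODEL_ERR(s)	(((s) >> 16) & 0xffff)	/* bits 31:16 */
#define MC_STATUS_MCA_ERR(s)	((s) & 0xffff)		/* bits 15:0 */

/* Hypothetical stand-in for the fields of struct mca_record used here. */
struct demo_record {
	int		is_intel;	/* vendor check, simplified */
	uint32_t	cpu_id;		/* CPUID signature */
	int		bank;		/* MCA bank number */
	uint64_t	status;		/* raw MCi_STATUS value */
};

/* Mirrors the role of intel6h_HSD131: 0 (default) means "filter". */
static int demo_logging_enabled;

/* Same predicate as mca_mute() in the diff, on the stand-in record. */
static int
demo_mute(const struct demo_record *rec)
{

	return (rec->is_intel && rec->cpu_id == 0x306c3 && rec->bank == 0 &&
	    rec->status == 0x90000040000f0005ULL && !demo_logging_enabled);
}

/* CPUID signature decoding; extended family is ignored (fine for family 6). */
static unsigned
cpuid_family(uint32_t id)
{

	return ((id >> 8) & 0xf);
}

static unsigned
cpuid_model(uint32_t id)
{

	/* Extended model (bits 19:16) forms the high nibble. */
	return ((((id >> 16) & 0xf) << 4) | ((id >> 4) & 0xf));
}

static unsigned
cpuid_stepping(uint32_t id)
{

	return (id & 0xf);
}
```

Under these assumptions, 0x90000040000f0005 decodes as VAL and EN set, UC clear (a corrected error), a corrected error count of 1, model-specific code 0x000f and MCA error code 0x0005, which the SDM lists as an internal parity error. 0x306c3 decodes as family 6, model 0x3c, stepping 3. Going by the SYSCTL_INT declaration in the diff, the toggle is exposed as hw.mca.intel6h_HSD131 with CTLFLAG_RDTUN, so it would presumably have to be set as a loader tunable rather than at run time.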

