From: Lee MATTHEWS <Lee.MATTHEWS.external@stormshield.eu>
To: freebsd-hackers@FreeBSD.org
Subject: BINIT and BERR signals in MCA
Date: Tue, 11 Apr 2023 11:45:54 +0000
Message-ID: <4bd3e1017a104598ab92e658f25b5367@stormshield.eu>
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

Hello,

One of our clients is experiencing problems using one of our products. It runs FreeBSD 11.3 on an Intel Atom Apollo Lake E3930 two-core SoC processor.

Occasionally, under very light load, the kernel panics. I've managed to get a couple of vmcores, and the backtraces show that the MCA (machine check) interrupt handler is called.

In both of the vmcores I recovered, the Inter-Processor Interrupts are not being delivered from one CPU to the other. I've also noticed that the mca_internal structure records the MCA status register for bank 0, with the value 0x9000000020000003.

According to Intel's software architecture document, the MCA Error Code is 0x0003 ("The BINIT# from another processor caused this processor to enter machine check.") and the Model Specific Error Code is 0x2000 ("1 if BERR is driven.").
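
For reference, the way the status value splits into those two codes can be sketched in C. This is only a minimal sketch, assuming the architectural IA32_MCi_STATUS layout (flag bits in 63:57, model-specific error code in bits 31:16, MCA error code in bits 15:0). The macro and function names below are illustrative and are not FreeBSD kernel identifiers.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural flag bits of IA32_MCi_STATUS (names are illustrative). */
#define MCS_VAL   (1ULL << 63) /* register holds valid error information */
#define MCS_OVER  (1ULL << 62) /* an earlier error was overwritten */
#define MCS_UC    (1ULL << 61) /* error was not corrected by hardware */
#define MCS_EN    (1ULL << 60) /* reporting was enabled for this error */
#define MCS_MISCV (1ULL << 59) /* MCi_MISC holds additional information */
#define MCS_ADDRV (1ULL << 58) /* MCi_ADDR holds an address */
#define MCS_PCC   (1ULL << 57) /* processor context may be corrupt */

/* Bits 15:0: architectural MCA error code. */
static uint16_t
mcs_mca_code(uint64_t status)
{
    return (uint16_t)(status & 0xffff);
}

/* Bits 31:16: model-specific error code. */
static uint16_t
mcs_model_code(uint64_t status)
{
    return (uint16_t)((status >> 16) & 0xffff);
}

/* Print the flag bits and both error-code fields of one bank's status. */
static void
mcs_print(unsigned bank, uint64_t status)
{
    printf("bank %u status 0x%016" PRIx64 "\n", bank, status);
    printf("  VAL=%d OVER=%d UC=%d EN=%d MISCV=%d ADDRV=%d PCC=%d\n",
        (status & MCS_VAL) != 0, (status & MCS_OVER) != 0,
        (status & MCS_UC) != 0, (status & MCS_EN) != 0,
        (status & MCS_MISCV) != 0, (status & MCS_ADDRV) != 0,
        (status & MCS_PCC) != 0);
    printf("  MCA error code:            0x%04x\n",
        (unsigned)mcs_mca_code(status));
    printf("  model-specific error code: 0x%04x\n",
        (unsigned)mcs_model_code(status));
}
```

Calling mcs_print(0, 0x9000000020000003ULL) with the value from the vmcores shows VAL and EN set, UC and PCC clear, an MCA error code of 0x0003 and a model-specific error code of 0x2000.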

The Intel document is not clear; could anyone please explain what the BINIT and BERR signals mean? They appear to be related to a bus, but I'm not sure which one: a bus external to the Atom SoC, or one of the internal buses within it?

Do you have any idea what could generate this type of error? Is it more likely a hardware or a software issue?

Thanks in advance.

Best wishes,
Lee Matthews
