Subject: Re: BINIT and BERR signals in MCA
From: Eugene Grosbein
To: Lee MATTHEWS, freebsd-hackers@FreeBSD.org
Date: Tue, 11 Apr 2023 18:59:08 +0700
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

11.04.2023 18:45, Lee MATTHEWS wrote:

> Hello,
>
> One of our clients is experiencing problems using one of our products. It runs FreeBSD 11.3 on an Intel Atom Apollo Lake E3930 two core SoC processor.
>
> Occasionally, under very light load, the kernel will panic. I've managed to get a couple of vmcores and I notice via the backtrace that the MCA interrupt is called.
>
> I've managed to recover two vmcores and I notice in both of them that the Inter-Processor Interrupts are not being transferred from one CPU to the other. I've also noticed that the structure mca_internal contains information concerning the state of the MCA status register (value : 0x9000000020000003) for bank 0.
>
> From Intel's software architecture document, the MCA Error Code is 0x0003 "The BINIT# from another processor caused this processor to enter machine check." and the Model Specific Error Code is 0x2000 "1 if BERR is driven."
>
> The Intel document is not clear; could anyone please explain what the BINIT and BERR signals mean? They appear to be related to a bus, but I'm not sure which one.
> A bus external to the Atom SoC or one of the internal buses within the Atom SoC?
>
> Do you have any ideas of what could generate this type of error? Is it likely a hardware or a software issue?
>
> Thanks in advance.
>
> Best wishes,
> Lee Matthews

I believe this is a hardware issue, probably overheating. Did you check the thermal sensor values?
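
For reference, the bank 0 status value quoted above can be split into its fields. The following is only a minimal sketch in Python, assuming the architectural IA32_MCi_STATUS layout (VAL in bit 63, OVER in bit 62, UC in bit 61, EN in bit 60, PCC in bit 57, model-specific error code in bits 31:16, MCA error code in bits 15:0). The function and constant names are illustrative and are not taken from the FreeBSD sources.

```python
# Illustrative decoder for an IA32_MCi_STATUS value.
# Assumption: architectural bit layout as described in the lead-in;
# all names below are hypothetical helpers, not FreeBSD identifiers.

MC_STATUS_VAL = 1 << 63   # register contains a valid error
MC_STATUS_OVER = 1 << 62  # an earlier error was overwritten
MC_STATUS_UC = 1 << 61    # error was not corrected by hardware
MC_STATUS_EN = 1 << 60    # error reporting was enabled for this error
MC_STATUS_PCC = 1 << 57   # processor context may be corrupt


def decode_mc_status(status):
    """Split a 64-bit machine-check status value into named fields."""
    return {
        "valid": bool(status & MC_STATUS_VAL),
        "overflow": bool(status & MC_STATUS_OVER),
        "uncorrected": bool(status & MC_STATUS_UC),
        "enabled": bool(status & MC_STATUS_EN),
        "context_corrupt": bool(status & MC_STATUS_PCC),
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


# Value reported for bank 0 in the original message.
fields = decode_mc_status(0x9000000020000003)

for name, value in fields.items():
    # Print the two code fields in hex, the flags as booleans.
    if isinstance(value, bool):
        print(f"{name}: {value}")
    else:
        print(f"{name}: 0x{value:04x}")
```

For this value the decode gives MCA error code 0x0003 and model-specific code 0x2000, matching the figures in the original message, with VAL and EN set and UC and PCC clear.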