Re: BINIT and BERR signals in MCA

From: Eugene Grosbein <eugen_at_grosbein.net>
Date: Wed, 12 Apr 2023 03:14:52 UTC

11.04.2023 21:28, Lee MATTHEWS wrote:

> Thanks for getting back to me Eugene.
> 
> 
> On the two cores that I've received, they seem to die at the same point :
> 
> 
> #4  0xffffffff8049a9e3 in panic (fmt=<unavailable>) at ../../../kern/kern_shutdown.c:714
> #5  0xffffffff80780a2b in mca_intr () at ../../../x86/x86/mca.c:1193
> #6  <signal handler called>
> #7  smp_rendezvous_action () at ../../../kern/subr_smp.c:417
> #8  0xffffffff804e5f79 in smp_rendezvous_cpus (map=...,
>     setup_func=0xffffffff804e5e40 <smp_no_rendezvous_barrier>,
>     action_func=0xffffffff80496730 <rm_cleanIPI>,
>     teardown_func=0xffffffff804e5e40 <smp_no_rendezvous_barrier>, arg=0xffffffff80cb5048 <g_conf_lock>)
>     at ../../../kern/subr_smp.c:554
> #9  0xffffffff80496639 in _rm_wlock (rm=0xffffffff80cb5048 <g_conf_lock>)
>     at ../../../kern/kern_rmlock.c:551
> 
> Do you think the temperature could still be an issue? If it were temperature related,
> could one not expect the MCA interrupt to occur during other function calls?

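It depends on how much time the CPU spends waiting for events vs. doing other things.
If most of its time goes to waiting in one place, a fault that has nothing to do
with the code being run will still most often be caught in that place.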
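
To put rough numbers on that argument, here is a minimal sketch. It assumes the
fault arrives at a random moment, independent of what the code is doing, and
that the crashes are independent of each other. The function name and the time
fractions are made up for illustration; nothing here is measured on your system.

```python
def same_spot_probability(time_fraction: float, crashes: int) -> float:
    """Chance that `crashes` independent, randomly timed faults all land
    in code where the CPU spends `time_fraction` of its time."""
    if not 0.0 <= time_fraction <= 1.0:
        raise ValueError("time_fraction must be between 0 and 1")
    if crashes < 1:
        raise ValueError("crashes must be at least 1")
    return time_fraction ** crashes


# Hypothetical time fractions, not taken from the crash dumps.
for fraction in (0.05, 0.5, 0.95):
    p = same_spot_probability(fraction, crashes=2)
    print(f"time fraction {fraction:.2f}: both crashes in the same place with p={p:.4f}")
```

Under these assumptions, two cores stopping at the same point is unremarkable
when the time fraction is high and rather unlikely when it is low.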