Re: panic: data abort in critical section or under mutex (was: Re: panic: Unknown kernel exception 0 esr_el1 2000000 (on 14-CURRENT/aarch64 Feb 28))

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 07 Mar 2022 01:01:30 UTC
From: Ronald Klop <ronald-lists_at_klop.ws> wrote on
Date: Sun, 6 Mar 2022 23:22:42 +0100 (CET) :

> Did some binary search with kernels from artifact.ci.freebsd.org.
> 
> I suspect "rmlock: Micro-optimize read locking" as cause.
> 
> https://cgit.freebsd.org/src/commit/?id=c84bb8cd771ce4bed58152e47a32dda470bef23a
> 
> 
> And "rmlock: Add required compiler barriers to _rm_runlock()" as solution.
> 
> https://cgit.freebsd.org/src/commit/?id=89ae8eb74e87ac19aa2d7abe4ba16bcccd32bb9f
> 
> 
> So I probably just had a bad day.

Well, there is a report of a crash during buildkernel on a system from
after that pair of commits:

https://lists.freebsd.org/archives/freebsd-arm/2022-March/001078.html

that references additional information at:

http://www.zefox.net/~fbsd/rpi3/crashes/20220304/readme

and reported:

QUOTE
The console connection dropped before the crash (unrelated) I didn't
get the preamble, all I have is the backtrace and buildkernel log.
Here's the backtrace:
db> bt
Tracing pid 14795 tid 100098 td 0xffffa00017815600
db_trace_self() at db_trace_self
db_stack_trace() at db_stack_trace+0x11c
db_command() at db_command+0x368
db_command_loop() at db_command_loop+0x54
db_trap() at db_trap+0xf8
kdb_trap() at kdb_trap+0x1cc
handle_el1h_sync() at handle_el1h_sync+0x10
--- exception, esr 0xf2000000
kdb_enter() at kdb_enter+0x44
vpanic() at vpanic+0x1b0
panic() at panic+0x44
data_abort() at data_abort+0x2e8
handle_el1h_sync() at handle_el1h_sync+0x10
--- exception, esr 0x96000004
_rm_rlock_debug() at _rm_rlock_debug+0x8c
sysctl_root_handler_locked() at sysctl_root_handler_locked+0x140
sysctl_root() at sysctl_root+0x1ac
userland_sysctl() at userland_sysctl+0x140
sys___sysctl() at sys___sysctl+0x68
do_el0_sync() at do_el0_sync+0x520
handle_el0_sync() at handle_el0_sync+0x40
--- exception, esr 0x56000000
END QUOTE

The backtrace above does reference _rm_rlock_debug: it is the frame
in which the data abort (esr 0x96000004) was taken. Might be
related?
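
For reading the esr values in that backtrace, here is a minimal sketch
of a decoder. It assumes only the ESR_ELx field layout from the Arm
Architecture Reference Manual (EC in bits 31:26, ISS in bits 24:0,
DFSC in ISS bits 5:0 for data aborts). The helper names are
hypothetical, not FreeBSD kernel functions, and only the exception
classes that show up in this thread are named.

```c
#include <stdint.h>

/*
 * ESR_ELx field layout (Arm Architecture Reference Manual):
 * EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
 * All helper names below are hypothetical, for illustration only.
 */
static unsigned
esr_ec(uint64_t esr)
{
	return ((unsigned)((esr >> 26) & 0x3f));
}

static uint32_t
esr_iss(uint64_t esr)
{
	return ((uint32_t)(esr & 0x1ffffff));
}

/* Name only the exception classes seen in this thread. */
static const char *
esr_ec_name(unsigned ec)
{
	switch (ec) {
	case 0x00: return ("unknown reason");
	case 0x15: return ("SVC from AArch64");
	case 0x24: return ("data abort, lower EL");
	case 0x25: return ("data abort, same EL");
	case 0x3c: return ("BRK from AArch64");
	default:   return ("other");
	}
}

/*
 * For a data abort, DFSC = ISS[5:0]; the values 0x04..0x07 are
 * translation faults at level 0..3.  Returns the level, or -1 if
 * the value is not a data abort with a translation fault.
 */
static int
esr_translation_fault_level(uint64_t esr)
{
	unsigned ec = esr_ec(esr);
	unsigned dfsc = esr_iss(esr) & 0x3f;

	if (ec != 0x24 && ec != 0x25)
		return (-1);
	return ((dfsc & 0x3c) == 0x04 ? (int)(dfsc & 0x3) : -1);
}
```

With these helpers, 0x96000004 decodes to EC 0x25 (data abort taken
at the same exception level) with DFSC 0x04 (translation fault,
level 0); 0xf2000000 is EC 0x3c (BRK, the kdb_enter debugger entry);
and 0x56000000 is EC 0x15 (SVC, the sysctl system call). The decode
says what kind of fault happened in _rm_rlock_debug, not why.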

The readme reports:

main-n253603-0b25cbc79d3: Thu Mar  3 22:48:31 PST 2022

for the system doing the buildkernel. This is after
89ae8eb74e8 (the compiler-barriers commit above).

(It also mentions another panic earlier in the week,
apparently not reported to the lists at the time.)

===
Mark Millard
marklmi at yahoo.com