Re: A native armv7 panic during kyua runs: sys/netinet6/exthdr:exthdr -> Fatal kernel mode data abort: 'Alignment Fault' on read

From: Mark Millard <marklmi_at_yahoo.com>
Date: Sat, 05 Aug 2023 06:11:54 UTC
On Aug 4, 2023, at 20:58, Warner Losh <imp@bsdimp.com> wrote:

> It might make sense to work up a patch that skips this test on armv7 after filing a bug (the usual way)....
> 
> Warner

Actually, looking at the backtrace, it seems I've previously
listed the same sort of backtrace structure in:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271759

comment 12. Hans Petter Selasky had been working on that
bugzilla entry. I'll add a note that this time I got it
with the built-in Ethernet instead of the dongle used
previously, and that sys/netinet6/exthdr:exthdr is a
way of producing the panic. [Done.]

In /usr/main-src/tests/sys/netinet6/exthdr.sh, commenting
out one line would disable the specific test (leading
whitespace might not be preserved below):

atf_init_test_cases()
{

#        atf_add_test_case "exthdr"
}
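
As an alternative to commenting the line out, here is a minimal
sketch of an architecture check that a skip patch of the kind
Warner suggests might use. The helper name panics_on_arch is
hypothetical (it is not part of exthdr.sh or atf-sh), and the
processor name "armv7" is taken from the uname output quoted
further down:

```shell
# Hypothetical helper, not part of exthdr.sh or atf-sh.
# Returns 0 (true) if the given processor name is one where the
# test is known to panic the kernel, 1 (false) otherwise.
panics_on_arch()
{
	case "$1" in
	armv7)	return 0 ;;
	*)	return 1 ;;
	esac
}

# Assumed usage inside an atf-sh test body (atf_skip is provided
# by atf-sh; the message text is only an illustration):
#
#   if panics_on_arch "$(uname -p)"; then
#           atf_skip "Panics on armv7, see PR 271759"
#   fi
```

With something like this the test would stay registered and be
reported as skipped on armv7, instead of vanishing from the test
list on every architecture.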

[FYI: All my kyua activity has been for FreeBSD main,
generally targeting contexts with some armv7 code
involved. It is associated with my having been a
tester of early lib32 drafts.]

I already have another commented out line for an armv7
panic (leading whitespace might not be preserved):

# git -C /usr/main-src/ diff tests/sys/net/
diff --git a/tests/sys/net/if_bridge_test.sh b/tests/sys/net/if_bridge_test.sh
index eb3a792df449..dcdac75103cd 100755
--- a/tests/sys/net/if_bridge_test.sh
+++ b/tests/sys/net/if_bridge_test.sh
@@ -675,7 +675,7 @@ atf_init_test_cases()
        atf_add_test_case "delete_with_members"
        atf_add_test_case "mac_conflict"
        atf_add_test_case "stp_validation"
-       atf_add_test_case "gif"
+#      atf_add_test_case "gif"
        atf_add_test_case "mtu"
        atf_add_test_case "vlan"
 }

In the original discovery, having if_bridge.ko already loaded was
important to getting the "gif" panic.

But I've not yet put effort into isolating a cleaner/simpler test
than I got the failure with. Nor have I done a range of comparisons
of differing contexts yet.

There are other armv7-related issues, one in particular
being:

A) All the long timeouts [300s+] are for *.py style tests. (Lots of
   these.)

B) All the *.py style tests that do not have a long timeout have one of:

 ->  skipped: comment me to run the test
 ->  skipped: Current architecture 'armv7' not supported
__test_cases_list__  ->  broken: Test program did not exit cleanly
__test_cases_list__  ->  broken: Test case list wrote to stderr

There are about 10 of the "comment me" ones and 1 each of the other
(B) ones, if I remember right.

In other words, basically all the *.py based tests are broken or
skipped as kyua classifies things.

I've no clue yet if (A) is tied to the ports':

cryptography/hazmat/bindings/_openssl.abi3.so

openssl 3 incompatibility or not. But I've only seen the
issue in armv7 contexts so far.

I've spent time today on this issue but have made no progress
on identifying what leads to the kdump/truss-output being as
it is.

If the *.py tests were working, I'd not be surprised to then
find more armv7 panics than the kyua tests can currently reach.

> On Fri, Aug 4, 2023 at 12:59 AM Mark Millard <marklmi@yahoo.com> wrote:
> While discovered via an attempted overall kyua run, the following is
> sufficient to get the crash in my native armv7 context:
> 
> # /usr/bin/kyua test -k /usr/tests/Kyuafile sys/netinet6/exthdr:exthdr
> sys/netinet6/exthdr:exthdr  ->  Fatal kernel mode data abort: 'Alignment Fault' on read
> trapframe: 0xdfb97aa0
> FSR=00000001, FAR=db43ab76, spsr=60000013
> r0 =dfedd000, r1 =dfb97b34, r2 =00000000, r3 =00000000
> r4 =00000000, r5 =00000000, r6 =db43ab76, r7 =db43ab66
> r8 =c096383c, r9 =00000000, r10=db132400, r11=dfb97b60
> r12=00000000, ssp=dfb97b30, slr=c0b4e2c0, pc =c04e6b70
> 
> panic: Fatal abort
> cpuid = 0
> time = 1691131498
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
>          pc = 0xc065f414  lr = 0xc007db80 (db_trace_self_wrapper+0x30)
>          sp = 0xdfb97858  fp = 0xdfb97970
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>          pc = 0xc007db80  lr = 0xc031a834 (vpanic+0x140)
>          sp = 0xdfb97978  fp = 0xdfb97998
>          r4 = 0x00000100  r5 = 0x00000000
>          r6 = 0xc07c369a  r7 = 0xc0b32e58
> vpanic() at vpanic+0x140
>          pc = 0xc031a834  lr = 0xc031a6f4 (vpanic)
>          sp = 0xdfb979a0  fp = 0xdfb979a4
>          r4 = 0xdfb97aa0  r5 = 0x00000013
>          r6 = 0xdb43ab76  r7 = 0x00000001
>          r8 = 0x00000001  r9 = 0xdfedd000
>         r10 = 0xdb43ab76
> vpanic() at vpanic
>          pc = 0xc031a6f4  lr = 0xc06849dc (abort_align)
>          sp = 0xdfb979ac  fp = 0xdfb979d8
>          r4 = 0x00000001  r5 = 0x00000001
>          r6 = 0xdfedd000  r7 = 0xdb43ab76
>          r8 = 0xdfb979a4  r9 = 0xc031a6f4
>         r10 = 0xdfb979ac
> abort_align() at abort_align
>          pc = 0xc06849dc  lr = 0xc0684a50 (abort_align+0x74)
>          sp = 0xdfb979e0  fp = 0xdfb979f8
>          r4 = 0x00000013 r10 = 0xdb43ab76
> abort_align() at abort_align+0x74
>          pc = 0xc0684a50  lr = 0xc06846a8 (abort_handler+0x45c)
>          sp = 0xdfb97a00  fp = 0xdfb97a98
>          r4 = 0x00000000 r10 = 0xdb43ab76
> abort_handler() at abort_handler+0x45c
>          pc = 0xc06846a8  lr = 0xc0661cc8 (exception_exit)
>          sp = 0xdfb97aa0  fp = 0xdfb97b60
>          r4 = 0x00000000  r5 = 0x00000000
>          r6 = 0xdb43ab76  r7 = 0xdb43ab66
>          r8 = 0xc096383c  r9 = 0x00000000
>         r10 = 0xdb132400
> exception_exit() at exception_exit
>          pc = 0xc0661cc8  lr = 0xc0b4e2c0 (__pcpu)
>          sp = 0xdfb97b30  fp = 0xdfb97b60
>          r0 = 0xdfedd000  r1 = 0xdfb97b34
>          r2 = 0x00000000  r3 = 0x00000000
>          r4 = 0x00000000  r5 = 0x00000000
>          r6 = 0xdb43ab76  r7 = 0xdb43ab66
>          r8 = 0xc096383c  r9 = 0x00000000
>         r10 = 0xdb132400 r12 = 0x00000000
> in6ifa_ifwithaddr() at in6ifa_ifwithaddr+0x30
>          pc = 0xc04e6b70  lr = 0xc04f9030 (ip6_input+0xd38)
>          sp = 0xdfb97b68  fp = 0xdfb97c28
>          r4 = 0xdb43ab76  r5 = 0xdb43ab5e
>          r6 = 0x00000000  r7 = 0xdb43ab66
> ip6_input() at ip6_input+0xd38
>          pc = 0xc04f9030  lr = 0xc046d66c (netisr_dispatch_src+0xf8)
>          sp = 0xdfb97c30  fp = 0xdfb97c58
>          r4 = 0xdb43ab00  r5 = 0x00000006
>          r6 = 0x00000007  r7 = 0xc0b49d50
>          r8 = 0xdafea0c0  r9 = 0xdb43ab00
>         r10 = 0x00000086
> netisr_dispatch_src() at netisr_dispatch_src+0xf8
>          pc = 0xc046d66c  lr = 0xc04641b0 (ether_demux+0x18c)
>          sp = 0xdfb97c60  fp = 0xdfb97c78
>          r4 = 0x00000006  r5 = 0x00001201
>          r6 = 0xdb132400  r7 = 0x000000ff
>          r8 = 0xdafea0c0  r9 = 0xdb43ab00
>         r10 = 0x00000086
> ether_demux() at ether_demux+0x18c
>          pc = 0xc04641b0  lr = 0xc0465880 (ether_nh_input+0x490)
>          sp = 0xdfb97c80  fp = 0xdfb97ce0
>          r4 = 0xdb132400  r5 = 0xdb43ab00
>          r6 = 0xdb43ab50 r10 = 0x00000086
> ether_nh_input() at ether_nh_input+0x490
>          pc = 0xc0465880  lr = 0xc046d66c (netisr_dispatch_src+0xf8)
>          sp = 0xdfb97ce8  fp = 0xdfb97d10
>          r4 = 0xdb43ab00  r5 = 0x00000005
>          r6 = 0x0000000c  r7 = 0xc0b49d30
>          r8 = 0xdafea0c0  r9 = 0xdb43ab00
>         r10 = 0xc098d18f
> netisr_dispatch_src() at netisr_dispatch_src+0xf8
>          pc = 0xc046d66c  lr = 0xc04645c4 (ether_input+0x50)
>          sp = 0xdfb97d18  fp = 0xdfb97d48
>          r4 = 0xdb43ab00  r5 = 0x00000000
>          r6 = 0x00008803  r7 = 0x00000000
>          r8 = 0xdafea0c0  r9 = 0xdb43ab00
>         r10 = 0xc098d18f
> ether_input() at ether_input+0x50
>          pc = 0xc04645c4  lr = 0xdffb3f08 ($a.10+0x108)
>          sp = 0xdfb97d50  fp = 0xdfb97d78
>          r4 = 0xdb132400  r5 = 0xdaff8b00
>          r6 = 0xdaff8b10  r7 = 0x00000000
>          r8 = 0x00000000 r10 = 0xc098d18f
> $a.10() at $a.10+0x108
>          pc = 0xdffb3f08  lr = 0xc038cb2c (taskqueue_run_locked+0x1c4)
>          sp = 0xdfb97d80  fp = 0xdfb97dd8
>          r4 = 0xe0145100  r5 = 0xdaff8b2c
>          r6 = 0xe0145150  r7 = 0x00000001
>          r8 = 0x00000000  r9 = 0xdfb97d90
>         r10 = 0x00000001
> taskqueue_run_locked() at taskqueue_run_locked+0x1c4
>          pc = 0xc038cb2c  lr = 0xc038e4e4 (taskqueue_thread_loop+0x1b0)
>          sp = 0xdfb97de0  fp = 0xdfb97e10
>          r4 = 0xe0145100  r5 = 0xe0145140
>          r6 = 0xc07af4c4  r7 = 0x00000000
>          r8 = 0xc098d18f  r9 = 0x00000100
>         r10 = 0xc0b228a0
> taskqueue_thread_loop() at taskqueue_thread_loop+0x1b0
>          pc = 0xc038e4e4  lr = 0xc02cdf0c (fork_exit+0xc0)
>          sp = 0xdfb97e18  fp = 0xdfb97e38
>          r4 = 0xdfedd000  r5 = 0xc0b224e0
>          r6 = 0xc038e334  r7 = 0xdffc4f54
>          r8 = 0xdfb97e40  r9 = 0xc098d191
> fork_exit() at fork_exit+0xc0
>          pc = 0xc02cdf0c  lr = 0xc0661c5c (swi_exit)
>          sp = 0xdfb97e40  fp = 0x00000000
>          r4 = 0xc038e334  r5 = 0xdffc4f54
>          r6 = 0xc0b45d84  r7 = 0xd73bcba0
>          r8 = 0x00000001 r10 = 0xc0b228a0
> swi_exit() at swi_exit
>          pc = 0xc0661c5c  lr = 0xc0661c5c (swi_exit)
>          sp = 0xdfb97e40  fp = 0x00000000
> KDB: enter: panic
> [ thread pid 0 tid 100230 ]
> 
> For reference:
> 
> # uname -apKU
> FreeBSD OPiP2E-RPi2v1p1 14.0-CURRENT FreeBSD 14.0-CURRENT armv7 1400093 #6 main-n264334-215bab7924f6-dirty: Tue Jul 25 23:11:39 PDT 2023     root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA7-nodbg-clang/usr/main-src/arm.armv7/sys/GENERIC-NODBG-CA7 arm armv7 1400093 1400093
> 
> The OrangePi+ 2Ed was the type of system booted and tested.
> 
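
The address arithmetic behind the 'Alignment Fault' in the quoted
trapframe can be checked with a few lines of sh. This is only a
sketch: the FAR value is copied from the trapframe, but treating
r5 of the in6ifa_ifwithaddr() frame as the start of the IPv6
header, and r6 of the ether_demux() frame as the start of the
Ethernet frame, are assumptions on my part and are not confirmed
by the backtrace itself.

```shell
# FAR (also r6) from the quoted trapframe: the faulting read address.
far=0xdb43ab76
# r5 in the in6ifa_ifwithaddr() frame. ASSUMPTION: start of the IPv6 header.
hdr=0xdb43ab5e
# r6 in the ether_demux() frame. ASSUMPTION: start of the Ethernet frame.
frame=0xdb43ab50

# A remainder of 0 would mean the address is 32-bit aligned.
far_mod4=$(( $far % 4 ))
# Offset of the faulting address from the assumed IPv6 header start.
far_off=$(( $far - $hdr ))
# Offset of the assumed IPv6 header from the assumed frame start.
hdr_off=$(( $hdr - $frame ))

echo "FAR mod 4           = $far_mod4"
echo "FAR - assumed hdr   = $far_off"
echo "assumed hdr - frame = $hdr_off"
```

The results are 2, 24 and 14. So the faulting address is 2-byte
but not 4-byte aligned. Under the stated assumptions, 24 matches
the offset of the destination address within an IPv6 header, and
14 matches the length of an Ethernet header, which would leave
the IPv6 header itself off a 4-byte boundary.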


===
Mark Millard
marklmi at yahoo.com