(beagleboneblack/urtwn) Kernel page fault with the following non-sleepable locks held [lock order reversal then later Alignment Fault]
Mark Millard
markmi at dsl-only.net
Mon Jun 20 01:32:31 UTC 2016
Otacílio otacilio.neto at bsd.com.br wrote on Sun Jun 19 21:40:29 UTC 2016:
> Processing entries: 6%lock order reversal:
> 1st 0xc30b35d4 syncer (syncer) @ /usr/src/sys/kern/vfs_subr.c:2048
> 2nd 0xc319d6f4 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2498
then later:
> Kernel page fault with the following
> non-sleepable locks held:
> exclusive sleep mutex tcp_sc_head (tcp_sc_head) r = 0 (0xc2f9d520)
> locked @ /usr/src/sys/netinet/tcp_syncache.c:494
> shared rw tcp (tcp) r = 0 (0xc08ef348) locked @
> /usr/src/sys/netinet/tcp_input.c:1034
> exclusive rw tcpinp (tcpinp) r = 0 (0xc31ab494) locked @
> /usr/src/sys/netinet/in_pcb.c:1964
then later:
> Fatal kernel mode data abort: 'Alignment Fault' on read
> trapframe: 0xdcfe58a8
> FSR=00000001, FAR=c2e7187a, spsr=60000013
> r0 =c08e7988, r1 =00000004, r2 =c06fa3ad, r3 =000007b6
> r4 =dcfe5a10, r5 =dcfe5b28, r6 =c2e71876, r7 =c2f9d520
> r8 =c2f9d520, r9 =c2e71876, r10=dcfe5b28, r11=dcfe5970
> r12=00000000, ssp=dcfe5938, slr=c2d34370, pc =c053d8e8
>
> [ thread pid 13 tid 100036 ]
> Stopped at syncookie_lookup+0x38: ldmib r6, {r1-r2}
This was based on 11.0 -r301846.
The last part may be tied to the later fix in -r301872 for network code (quoting https://lists.freebsd.org/pipermail/svn-src-head/2016-June/088339.html ):
> Author: ian
> Date: Mon Jun 13 16:48:27 2016
> New Revision: 301872
> URL:
> https://svnweb.freebsd.org/changeset/base/301872
>
>
> Log:
> Do not define __NO_STRICT_ALIGNMENT for armv6. While the requirements
> are no longer natural-alignment strict, there are still some restrictions.
>
> FreeBSD network code assumes data is naturally-aligned or is running
> on a platform with no restrictions; pointers are not annotated to
> indicate the data pointed to may be packed or unaligned. The clang
> optimizer can sometimes combine the load or store of a pair of adjacent
> 32-bit values into a single doubleword load/store, and that operation
> requires at least 4-byte alignment. __NO_STRICT_ALIGNMENT can lead
> to tcp headers being only 2-byte aligned.
>
> Note that alignment faults remain disabled on armv6, this change reverts
> only the defining of the symbol which leads to some overly-agressive code
> shortcuts when building common/shared drivers and network code for arm.
>
> Approved by: re(kib)
>
> Modified:
> head/sys/arm/include/_types.h
>
> Modified: head/sys/arm/include/_types.h
> ==============================================================================
> --- head/sys/arm/include/_types.h Mon Jun 13 11:19:06 2016 (r301871)
> +++ head/sys/arm/include/_types.h Mon Jun 13 16:48:27 2016 (r301872)
> @@ -43,10 +43,6 @@
> #error this file needs sys/cdefs.h as a prerequisite
> #endif
>
> -#if __ARM_ARCH >= 6
> -#define __NO_STRICT_ALIGNMENT
> -#endif
> -
> /*
> * Basic types upon which most other types are built.
> */
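As an illustration of the alignment issue the commit log describes, here is a minimal sketch in C. It is not taken from the FreeBSD sources or from -r301872: `struct pair32`, `load_pair_unaligned` and `is_4byte_aligned` are hypothetical names. It shows one alignment-safe way to read two adjacent 32-bit values from a buffer that may be only 2-byte aligned. Copying bytes with `memcpy` makes no alignment promise to the compiler, whereas dereferencing a cast pointer invites the combined doubleword load mentioned in the log.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pair of adjacent 32-bit fields, for illustration only. */
struct pair32 {
	uint32_t first;
	uint32_t second;
};

/*
 * Read two adjacent 32-bit values from src without assuming anything
 * about src's alignment: the bytes are copied into a properly aligned
 * local object instead of being loaded through a cast pointer.
 */
static struct pair32
load_pair_unaligned(const unsigned char *src)
{
	struct pair32 out;

	assert(src != NULL);
	memcpy(&out, src, sizeof(out));
	return (out);
}

/* Nonzero when addr is a multiple of 4, zero otherwise. */
static int
is_4byte_aligned(uintptr_t addr)
{
	return ((addr & 3u) == 0);
}
```

A caller would pass a pointer into the packet data, for example at an offset that is 2 modulo 4, and use the returned copy instead of the in-place fields.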
The later ssh-related report (Otacílio otacilio.neto at bsd.com.br, Sun Jun 19 21:47:40 UTC 2016) also includes a matching Alignment Fault, which may also fit with the fix above. Quoting:
> The kernel panic is totally reproducible. I need only do a ssh in the
> beaglebone using ptty on windows or ssh on freebsd and the kernel panic
> is raised.
>
> FreeBSD/arm (beaglebone) (ttyu0)
>
> login: Kernel page fault with the following non-sleepable locks held:
> exclusive sleep mutex tcp_sc_head (tcp_sc_head) r = 0 (0xc2f95480)
> locked @ /usr/src/sys/netinet/tcp_syncache.c:494
> shared rw tcp (tcp) r = 0 (0xc08ef348) locked @
> /usr/src/sys/netinet/tcp_input.c:1034
> exclusive rw tcpinp (tcpinp) r = 0 (0xc341e494) locked @
> /usr/src/sys/netinet/in_pcb.c:1964
> stack backtrace:
> Fatal kernel mode data abort: 'Alignment Fault' on read
> trapframe: 0xdcfe58a8
> FSR=00000001, FAR=c2e7807a, spsr=60000013
> r0 =c08e7988, r1 =00000004, r2 =c06fa3ad, r3 =000007b6
> r4 =dcfe5a10, r5 =dcfe5b28, r6 =c2e78076, r7 =c2f95480
> r8 =c2f95480, r9 =c2e78076, r10=dcfe5b28, r11=dcfe5970
> r12=00000000, ssp=dcfe5938, slr=c2d34370, pc =c053d8e8
>
> [ thread pid 13 tid 100036 ]
> Stopped at syncookie_lookup+0x38: ldmib r6, {r1-r2}
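The register dumps in both reports can be checked with a little arithmetic, sketched below in C. The helper names `ldmib_first_address` and `offset_in_word` are hypothetical. `ldmib` is "load multiple, increment before", so the first address read is the base register plus 4. With r6 from each dump, that gives exactly the reported FAR, and each FAR is 2 bytes past a 4-byte boundary.

```c
#include <assert.h>
#include <stdint.h>

/*
 * First address accessed by "ldmib rN, {...}": increment-before
 * addressing starts at the base register value plus 4.
 */
static uint32_t
ldmib_first_address(uint32_t base)
{
	assert(base <= UINT32_MAX - 4u);	/* no wraparound */
	return (base + 4u);
}

/* Byte offset of addr within its 4-byte word (0 means word aligned). */
static uint32_t
offset_in_word(uint32_t addr)
{
	return (addr & 3u);
}
```

For the first report r6=c2e71876 gives c2e7187a, and for the ssh report r6=c2e78076 gives c2e7807a. Both match the FAR values shown, and both have an offset of 2 within the word. That is consistent with a TCP header that is only 2-byte aligned, as the -r301872 log describes.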
The rpi2 that I have access to is busy doing buildworld and buildkernel under WITH_META_MODE for the first time, so I'll not be independently checking that context anytime soon. It is at -r301975 (basically matching ALPHA4's snapshot) but built "no debug" style for the kernel (with debug symbols). The rpi2 is the only supported armv6 context that I currently have access to. [I've not experimented with the ODROID-C2 material at https://github.com/tomtor/freebsd/tree/tc2 yet and may not for some time.]
You may want to try -r301872 or later, although the Alignment Fault might be the only thing that changes.
If -r301872 or later still shows an Alignment fault then Ian Lepore and others likely would be very interested in learning about it as soon as possible.
===
Mark Millard
markmi at dsl-only.net