(beagleboneblack/urtwn) Kernel page fault with the following non-sleepable locks held [lock order reversal then later Alignment Fault]

Mark Millard markmi at dsl-only.net
Mon Jun 20 01:32:31 UTC 2016


Otacílio otacilio.neto at bsd.com.br wrote on Sun Jun 19 21:40:29 UTC 2016:

> Processing entries:   6%lock order reversal:
>   1st 0xc30b35d4 syncer (syncer) @ /usr/src/sys/kern/vfs_subr.c:2048
>   2nd 0xc319d6f4 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2498

then later:

> Kernel page fault with the following 
> non-sleepable locks held:
> exclusive sleep mutex tcp_sc_head (tcp_sc_head) r = 0 (0xc2f9d520) 
> locked @ /usr/src/sys/netinet/tcp_syncache.c:494
> shared rw tcp (tcp) r = 0 (0xc08ef348) locked @ 
> /usr/src/sys/netinet/tcp_input.c:1034
> exclusive rw tcpinp (tcpinp) r = 0 (0xc31ab494) locked @ 
> /usr/src/sys/netinet/in_pcb.c:1964

then later:

> Fatal kernel mode data abort: 'Alignment Fault' on read
> trapframe: 0xdcfe58a8
> FSR=00000001, FAR=c2e7187a, spsr=60000013
> r0 =c08e7988, r1 =00000004, r2 =c06fa3ad, r3 =000007b6
> r4 =dcfe5a10, r5 =dcfe5b28, r6 =c2e71876, r7 =c2f9d520
> r8 =c2f9d520, r9 =c2e71876, r10=dcfe5b28, r11=dcfe5970
> r12=00000000, ssp=dcfe5938, slr=c2d34370, pc =c053d8e8
> 
> [ thread pid 13 tid 100036 ]
> Stopped at      syncookie_lookup+0x38:  ldmib   r6, {r1-r2}

This was based on 11.0 -r301846.
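
As a side note, the register dump above can be checked by hand. The sketch below is only illustrative arithmetic, not kernel code, and the helper names are made up for this note. It assumes the usual ARM meaning of ldmib (load multiple, increment before), where the first word is read from the base register plus 4. With r6=c2e71876 that gives c2e7187a, which matches the reported FAR and is only 2-byte aligned.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the FreeBSD tree): first address read by
 * "ldmib rN, {...}". Increment-before means the base is advanced by 4
 * before the first word is loaded.
 */
static uint32_t ldmib_first_addr(uint32_t base)
{
    return base + 4u;
}

/* Hypothetical helper: nonzero when addr is a multiple of 4 (word aligned). */
static int is_word_aligned(uint32_t addr)
{
    return (addr & 3u) == 0;
}
```

Feeding in r6 from the trace gives an address that fails the word-alignment check, which is consistent with the 'Alignment Fault' on read being reported for a load-multiple instruction.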

The last part may be tied to the later fix in -r301872 for network code (quoting https://lists.freebsd.org/pipermail/svn-src-head/2016-June/088339.html ):

> Author: ian
> Date: Mon Jun 13 16:48:27 2016
> New Revision: 301872
> URL: 
> https://svnweb.freebsd.org/changeset/base/301872
> 
> 
> Log:
>   Do not define __NO_STRICT_ALIGNMENT for armv6.  While the requirements
>   are no longer natural-alignment strict, there are still some restrictions.
>   
>   FreeBSD network code assumes data is naturally-aligned or is running
>   on a platform with no restrictions; pointers are not annotated to
>   indicate the data pointed to may be packed or unaligned.  The clang
>   optimizer can sometimes combine the load or store of a pair of adjacent
>   32-bit values into a single doubleword load/store, and that operation
>   requires at least 4-byte alignment.  __NO_STRICT_ALIGNMENT can lead
>   to tcp headers being only 2-byte aligned.
>   
>   Note that alignment faults remain disabled on armv6, this change reverts
>   only the defining of the symbol which leads to some overly-agressive code
>   shortcuts when building common/shared drivers and network code for arm.
>   
>   Approved by:	re(kib)
> 
> Modified:
>   head/sys/arm/include/_types.h
> 
> Modified: head/sys/arm/include/_types.h
> ==============================================================================
> --- head/sys/arm/include/_types.h	Mon Jun 13 11:19:06 2016	(r301871)
> +++ head/sys/arm/include/_types.h	Mon Jun 13 16:48:27 2016	(r301872)
> @@ -43,10 +43,6 @@
>  #error this file needs sys/cdefs.h as a prerequisite
>  #endif
>  
> -#if __ARM_ARCH >= 6
> -#define __NO_STRICT_ALIGNMENT
> -#endif
> -
>  /*
>   * Basic types upon which most other types are built.
>   */
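
To illustrate the kind of access the commit log above is describing: the following is a minimal sketch, not the actual FreeBSD code and not the fix itself (the fix is the removal of the define shown in the diff). The function names are hypothetical. It shows the general C idiom of reading a pair of adjacent 32-bit values through memcpy, so that the compiler is not entitled to assume 4-byte alignment and merge the two reads into a single doubleword or load-multiple instruction.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read one 32-bit value from a location that may be
 * only 2-byte (or 1-byte) aligned. memcpy makes no alignment assumption
 * about its source, unlike dereferencing a uint32_t pointer.
 */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/*
 * Hypothetical helper: read two adjacent 32-bit values starting at p,
 * again without assuming that p is 4-byte aligned.
 */
static void load_u32_pair(const void *p, uint32_t *a, uint32_t *b)
{
    const unsigned char *bytes = p;

    *a = load_u32_unaligned(bytes);
    *b = load_u32_unaligned(bytes + 4);
}
```

By contrast, reading the same two fields through a plain pointer to a struct of two uint32_t members tells the compiler the data is naturally aligned, which is the assumption the log says the network code makes.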

The later ssh-related report from Otacílio (otacilio.neto at bsd.com.br, Sun Jun 19 21:47:40 UTC 2016) also includes a matching Alignment Fault, which may also fit with the fix above. Quoting:

> The kernel panic is totally reproducible. I need only do a ssh in the 
> beaglebone using ptty on windows or ssh on freebsd and the kernel panic 
> is raised.
> 
> FreeBSD/arm (beaglebone) (ttyu0)
> 
> login: Kernel page fault with the following non-sleepable locks held:
> exclusive sleep mutex tcp_sc_head (tcp_sc_head) r = 0 (0xc2f95480) 
> locked @ /usr/src/sys/netinet/tcp_syncache.c:494
> shared rw tcp (tcp) r = 0 (0xc08ef348) locked @ 
> /usr/src/sys/netinet/tcp_input.c:1034
> exclusive rw tcpinp (tcpinp) r = 0 (0xc341e494) locked @ 
> /usr/src/sys/netinet/in_pcb.c:1964
> stack backtrace:
> Fatal kernel mode data abort: 'Alignment Fault' on read
> trapframe: 0xdcfe58a8
> FSR=00000001, FAR=c2e7807a, spsr=60000013
> r0 =c08e7988, r1 =00000004, r2 =c06fa3ad, r3 =000007b6
> r4 =dcfe5a10, r5 =dcfe5b28, r6 =c2e78076, r7 =c2f95480
> r8 =c2f95480, r9 =c2e78076, r10=dcfe5b28, r11=dcfe5970
> r12=00000000, ssp=dcfe5938, slr=c2d34370, pc =c053d8e8
> 
> [ thread pid 13 tid 100036 ]
> Stopped at      syncookie_lookup+0x38:  ldmib   r6, {r1-r2}



The rpi2 that I have access to is busy doing buildworld and buildkernel under WITH_META_MODE for the first time, so I will not be independently checking that context anytime soon. It is at -r301975 (basically matching ALPHA4's snapshot) but with the kernel built "no debug" style (with debug symbols). The rpi2 is the only supported armv6 context that I currently have access to. [I have not experimented with the ODROID-C2 material at https://github.com/tomtor/freebsd/tree/tc2 yet and may not for some time.]


You may want to try -r301872 or later, although the Alignment Fault may be the only thing that changes.

If -r301872 or later still shows an Alignment fault then Ian Lepore and others likely would be very interested in learning about it as soon as possible.


===
Mark Millard
markmi at dsl-only.net


