Re: Patched gpsd and /dev/pps0 results in "sleeping thread" kernel panic

From: Warner Losh <imp_at_bsdimp.com>
Date: Wed, 01 Sep 2021 03:07:53 UTC
On Tue, Aug 31, 2021 at 8:11 PM Craig Leres <leres@freebsd.org> wrote:

>
> On 8/31/21 6:54 PM, Warner Losh wrote:
> > Can you post a traceback? We recently tightened up the sleeping
> conditions
> > and many of the panics are straight forward to fix
>
> Is this sufficient?
>

Maybe. I'm guessing that the pps driver isn't setting driver_mtx for the
pps_state that it keeps, so when we're sleeping in pps_ioctl we're not
dropping the the ppbus lock, which leads to this diagnostic.

Warner


>                 Craig
>
> toc2 4 # kgdb /boot/kernel/kernel /var/crash/vmcore.1
> GNU gdb (GDB) 10.2 [GDB v10.2 for FreeBSD]
> Copyright (C) 2021 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64-portbld-freebsd12.2".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <https://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
>      <http://www.gnu.org/software/gdb/documentation/>.
>
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /boot/kernel/kernel...
> Reading symbols from /boot/kernel.LBLNET/kernel.debug...
>
> Unread portion of the kernel message buffer:
> Sleeping thread (tid 101408, pid 2046) owns a non-sleepable lock
> KDB: stack backtrace of thread 101408:
> sched_switch() at sched_switch+0x630/frame 0xfffffe0070e4f760
> mi_switch() at mi_switch+0xd4/frame 0xfffffe0070e4f790
> sleepq_catch_signals() at sleepq_catch_signals+0x403/frame
> 0xfffffe0070e4f7e0
> sleepq_timedwait_sig() at sleepq_timedwait_sig+0x14/frame
> 0xfffffe0070e4f820
> _sleep() at _sleep+0x1b3/frame 0xfffffe0070e4f8a0
> pps_ioctl() at pps_ioctl+0x298/frame 0xfffffe0070e4f8f0
> ppsioctl() at ppsioctl+0x48/frame 0xfffffe0070e4f920
> devfs_ioctl() at devfs_ioctl+0xb0/frame 0xfffffe0070e4f970
> VOP_IOCTL_APV() at VOP_IOCTL_APV+0x7b/frame 0xfffffe0070e4f9a0
> vn_ioctl() at vn_ioctl+0x16a/frame 0xfffffe0070e4fab0
> devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe0070e4fad0
> kern_ioctl() at kern_ioctl+0x2b7/frame 0xfffffe0070e4fb30
> sys_ioctl() at sys_ioctl+0xfa/frame 0xfffffe0070e4fc00
> amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe0070e4fd30
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0070e4fd30
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8004c899a, rsp =
> 0x7fffdfdfc6a8, rbp = 0x7fffdfdfc730 ---
> panic: sleeping thread
> cpuid = 8
> time = 1630456142
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfffffe005ab73ab0
> vpanic() at vpanic+0x17b/frame 0xfffffe005ab73b00
> panic() at panic+0x43/frame 0xfffffe005ab73b60
> propagate_priority() at propagate_priority+0x282/frame 0xfffffe005ab73b90
> turnstile_wait() at turnstile_wait+0x30c/frame 0xfffffe005ab73be0
> __mtx_lock_sleep() at __mtx_lock_sleep+0x199/frame 0xfffffe005ab73c70
> ppcintr() at ppcintr+0x2a0/frame 0xfffffe005ab73c90
> ithread_loop() at ithread_loop+0x23c/frame 0xfffffe005ab73cf0
> fork_exit() at fork_exit+0x7e/frame 0xfffffe005ab73d30
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe005ab73d30
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> Uptime: 17m20s
> Dumping 607 out of 12240
> MB:..3%..11%..22%..32%..43%..51%..61%..72%..82%..93%
>
> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> 55      /usr/src/sys/amd64/include/pcpu_aux.h: No such file or directory.
> (kgdb) bt
> #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> #1  doadump (textdump=1) at ../../../kern/kern_shutdown.c:371
> #2  0xffffffff80b83ada in kern_reboot (howto=260)
>      at ../../../kern/kern_shutdown.c:451
> #3  0xffffffff80b83f33 in vpanic (fmt=<optimized out>, ap=<optimized out>)
>      at ../../../kern/kern_shutdown.c:880
> #4  0xffffffff80b83d53 in panic (fmt=<unavailable>)
>      at ../../../kern/kern_shutdown.c:807
> #5  0xffffffff80be7152 in propagate_priority (td=0xfffff80196959740)
>      at ../../../kern/subr_turnstile.c:228
> #6  0xffffffff80be7d1c in turnstile_wait (ts=0xfffff8000357a780,
>      owner=<optimized out>, queue=0) at ../../../kern/subr_turnstile.c:785
> #7  0xffffffff80b62ca9 in __mtx_lock_sleep (c=0xfffff80003969bd0,
>      v=<optimized out>) at ../../../kern/kern_mutex.c:654
> #8  0xffffffff8086fcc0 in ppcintr (arg=0xfffff80003969b00)
>      at ../../../dev/ppc/ppc.c:1546
> #9  0xffffffff80b4637c in intr_event_execute_handlers (p=<optimized out>,
>      ie=0xfffff800030d9d00) at ../../../kern/kern_intr.c:1143
> #10 ithread_execute_handlers (p=<optimized out>, ie=0xfffff800030d9d00)
>      at ../../../kern/kern_intr.c:1156
> #11 ithread_loop (arg=0xfffff800039aa7e0) at ../../../kern/kern_intr.c:1236
> #12 0xffffffff80b42e1e in fork_exit (
>      callout=0xffffffff80b46140 <ithread_loop>, arg=0xfffff800039aa7e0,
>      frame=0xfffffe005ab73d40) at ../../../kern/kern_fork.c:1080
> #13 <signal handler called>
> (kgdb)
>