Re: Patched gpsd and /dev/pps0 results in "sleeping thread" kernel panic

From: Ian Lepore <ian_at_freebsd.org>
Date: Tue, 31 Aug 2021 21:12:58 -0600
On Tue, 2021-08-31 at 21:07 -0600, Warner Losh wrote:
> On Tue, Aug 31, 2021 at 8:11 PM Craig Leres <leres_at_freebsd.org>
> wrote:
> 
> > 
> > On 8/31/21 6:54 PM, Warner Losh wrote:
> > > Can you post a traceback? We recently tightened up the sleeping
> > conditions
> > > and many of the panics are straight forward to fix
> > 
> > Is this sufficient?
> > 
> 
> Maybe. I'm guessing that the pps driver isn't setting driver_mtx for
> the
> pps_state that it keeps, so when we're sleeping in pps_ioctl we're
> not
> dropping the the ppbus lock, which leads to this diagnostic.
> 
> Warner
> 
> 


Exactly so, I've just been digging through the code.  It's not an easy
fix because the ppbus pps driver doesn't have access to the ppbus mutex
at the point where it's registering to be a pps driver.

Craig, is there any chance you can configure your hardware so that the
PPS signal is connected to either the CTS or DCD pin of a uart (even
the same uart that's communicating with the gps receiver should work if
you use DCD)?  You'll get jitter and accuracy performance comparable to
the parallel port pin, and avoid this mutex/sleeping problem.

-- Ian



> >                 Craig
> > 
> > toc2 4 # kgdb /boot/kernel/kernel /var/crash/vmcore.1
> > GNU gdb (GDB) 10.2 [GDB v10.2 for FreeBSD]
> > Copyright (C) 2021 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later
> > <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law.
> > Type "show copying" and "show warranty" for details.
> > This GDB was configured as "x86_64-portbld-freebsd12.2".
> > Type "show configuration" for configuration details.
> > For bug reporting instructions, please see:
> > <https://www.gnu.org/software/gdb/bugs/>.
> > Find the GDB manual and other documentation resources online at:
> >      <http://www.gnu.org/software/gdb/documentation/>.
> > 
> > For help, type "help".
> > Type "apropos word" to search for commands related to "word"...
> > Reading symbols from /boot/kernel/kernel...
> > Reading symbols from /boot/kernel.LBLNET/kernel.debug...
> > 
> > Unread portion of the kernel message buffer:
> > Sleeping thread (tid 101408, pid 2046) owns a non-sleepable lock
> > KDB: stack backtrace of thread 101408:
> > sched_switch() at sched_switch+0x630/frame 0xfffffe0070e4f760
> > mi_switch() at mi_switch+0xd4/frame 0xfffffe0070e4f790
> > sleepq_catch_signals() at sleepq_catch_signals+0x403/frame
> > 0xfffffe0070e4f7e0
> > sleepq_timedwait_sig() at sleepq_timedwait_sig+0x14/frame
> > 0xfffffe0070e4f820
> > _sleep() at _sleep+0x1b3/frame 0xfffffe0070e4f8a0
> > pps_ioctl() at pps_ioctl+0x298/frame 0xfffffe0070e4f8f0
> > ppsioctl() at ppsioctl+0x48/frame 0xfffffe0070e4f920
> > devfs_ioctl() at devfs_ioctl+0xb0/frame 0xfffffe0070e4f970
> > VOP_IOCTL_APV() at VOP_IOCTL_APV+0x7b/frame 0xfffffe0070e4f9a0
> > vn_ioctl() at vn_ioctl+0x16a/frame 0xfffffe0070e4fab0
> > devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe0070e4fad0
> > kern_ioctl() at kern_ioctl+0x2b7/frame 0xfffffe0070e4fb30
> > sys_ioctl() at sys_ioctl+0xfa/frame 0xfffffe0070e4fc00
> > amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe0070e4fd30
> > fast_syscall_common() at fast_syscall_common+0xf8/frame
> > 0xfffffe0070e4fd30
> > --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8004c899a, rsp
> > =
> > 0x7fffdfdfc6a8, rbp = 0x7fffdfdfc730 ---
> > panic: sleeping thread
> > cpuid = 8
> > time = 1630456142
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> > 0xfffffe005ab73ab0
> > vpanic() at vpanic+0x17b/frame 0xfffffe005ab73b00
> > panic() at panic+0x43/frame 0xfffffe005ab73b60
> > propagate_priority() at propagate_priority+0x282/frame
> > 0xfffffe005ab73b90
> > turnstile_wait() at turnstile_wait+0x30c/frame 0xfffffe005ab73be0
> > __mtx_lock_sleep() at __mtx_lock_sleep+0x199/frame
> > 0xfffffe005ab73c70
> > ppcintr() at ppcintr+0x2a0/frame 0xfffffe005ab73c90
> > ithread_loop() at ithread_loop+0x23c/frame 0xfffffe005ab73cf0
> > fork_exit() at fork_exit+0x7e/frame 0xfffffe005ab73d30
> > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe005ab73d30
> > --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> > Uptime: 17m20s
> > Dumping 607 out of 12240
> > MB:..3%..11%..22%..32%..43%..51%..61%..72%..82%..93%
> > 
> > __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> > 55      /usr/src/sys/amd64/include/pcpu_aux.h: No such file or
> > directory.
> > (kgdb) bt
> > #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> > #1  doadump (textdump=1) at ../../../kern/kern_shutdown.c:371
> > #2  0xffffffff80b83ada in kern_reboot (howto=260)
> >      at ../../../kern/kern_shutdown.c:451
> > #3  0xffffffff80b83f33 in vpanic (fmt=<optimized out>,
> > ap=<optimized out>)
> >      at ../../../kern/kern_shutdown.c:880
> > #4  0xffffffff80b83d53 in panic (fmt=<unavailable>)
> >      at ../../../kern/kern_shutdown.c:807
> > #5  0xffffffff80be7152 in propagate_priority
> > (td=0xfffff80196959740)
> >      at ../../../kern/subr_turnstile.c:228
> > #6  0xffffffff80be7d1c in turnstile_wait (ts=0xfffff8000357a780,
> >      owner=<optimized out>, queue=0) at
> > ../../../kern/subr_turnstile.c:785
> > #7  0xffffffff80b62ca9 in __mtx_lock_sleep (c=0xfffff80003969bd0,
> >      v=<optimized out>) at ../../../kern/kern_mutex.c:654
> > #8  0xffffffff8086fcc0 in ppcintr (arg=0xfffff80003969b00)
> >      at ../../../dev/ppc/ppc.c:1546
> > #9  0xffffffff80b4637c in intr_event_execute_handlers (p=<optimized
> > out>,
> >      ie=0xfffff800030d9d00) at ../../../kern/kern_intr.c:1143
> > #10 ithread_execute_handlers (p=<optimized out>,
> > ie=0xfffff800030d9d00)
> >      at ../../../kern/kern_intr.c:1156
> > #11 ithread_loop (arg=0xfffff800039aa7e0) at
> > ../../../kern/kern_intr.c:1236
> > #12 0xffffffff80b42e1e in fork_exit (
> >      callout=0xffffffff80b46140 <ithread_loop>,
> > arg=0xfffff800039aa7e0,
> >      frame=0xfffffe005ab73d40) at ../../../kern/kern_fork.c:1080
> > #13 <signal handler called>
> > (kgdb)
> > 
Received on Wed Sep 01 2021 - 03:12:58 UTC

Original text of this message