r358252 causes intermittent hangs where processes are stuck sleeping on btalloc
Konstantin Belousov
kostikbel at gmail.com
Sat May 23 10:56:15 UTC 2020
On Fri, May 22, 2020 at 11:46:26PM +0000, Rick Macklem wrote:
> Konstantin Belousov wrote:
> >On Wed, May 20, 2020 at 11:58:50PM -0700, Ryan Libby wrote:
> >> On Wed, May 20, 2020 at 6:04 PM Rick Macklem <rmacklem at uoguelph.ca> wrote:
> >> >
> >> > Hi,
> >> >
> >> > Since I hadn't upgraded a kernel through the winter, it took me a while
> >> > to bisect this, but r358252 seems to be the culprit.
> >> >
> >> > If I do a kernel build over NFS using my not so big Pentium 4 (single core,
> >> > 1.25Gbytes RAM, i386), about every second attempt will hang.
> >> > When I do a "ps" in the debugger, I see processes sleeping on btalloc.
> >> > If I revert to r358251, I cannot reproduce this.
> >> >
> >> > Any ideas?
> >> >
> >> > I can easily test any change you might suggest to see if it fixes the
> >> > problem.
> >> >
> >> > If you want more debug info, let me know, since I can easily
> >> > reproduce it.
> >> >
> >> > Thanks, rick
> >>
> >> Nothing obvious to me. I can maybe try a repro on a VM...
> >>
> >> ddb ps, acttrace, alltrace, show all vmem, show page would be welcome.
> >>
> >> "btalloc" is "We're either out of address space or lost a fill race."
> >
> >Yes, I would be not surprised to be out of something on 1G i386 machine.
> >Please also add 'show alllocks'.
> Ok, I used an up to date head kernel and it took longer to reproduce a hang.
> This time, none of the processes are stuck on "btalloc".
> I'll try and give you most of the above, but since I have to type it in by hand
> from the screen, I might not get it all. (I'm no real typist;-)
> > show alllocks
> exclusive lockmgr ufs (ufs) r = 0 locked @ kern/vfs_subr.c: 3259
> exclusive lockmgr nfs (nfs) r = 0 locked @ kern/vfs_lookup.c:737
> exclusive sleep mutex kernel area domain (kernel arena domain) r = 0 locked @ kern/subr_vmem.c:1343
> exclusive lockmgr bufwait (bufwait) r = 0 locked @ kern/vfs_bio.c:1663
> exclusive lockmgr ufs (ufs) r = 0 locked @ kern/vfs_subr.c:2930
> exclusive lockmgr syncer (syncer) r = 0 locked @ kern/vfs_subr.c:2474
> Process 12 (intr) thread 0x.. (1000008)
> exclusive sleep mutex Giant (Giant) r = 0 locked @ kern/kern_intr.c:1152
>
> > ps
> - Not going to list them all, but here are the ones that seem interesting...
> 18 0 0 0 DL vlruwt 0x11d939cc [vnlru]
> 16 0 0 0 DL (threaded) [bufdaemon]
> 100069 D qsleep [bufdaemon]
> 100074 D - [bufspacedaemon-0]
> 100084 D sdflush 0x11923284 [/ worker]
> - and more of these for the other UFS file systems
> 9 0 0 0 DL psleep 0x1e2f830 [vmdaemon]
> 8 0 0 0 DL (threaded) [pagedaemon]
> 100067 D psleep 0x1e2e95c [dom0]
> 100072 D launds 0x1e2e968 [laundry: dom0]
> 100073 D umarcl 0x12cc720 [uma]
> … a bunch of usb and cam ones
> 100025 D - 0x1b2ee40 [doneq0]
> …
> 12 0 0 0 RL (threaded) [intr]
> 100007 I [swi6: task queue]
> 100008 Run CPU 0 [swi6: Giant taskq]
> …
> 100000 D swapin 0x1d96dfc [swapper]
> - and a bunch more in D state.
> Does this mean the swapper was trying to swap in?
>
> > acttrace
> - just shows the keyboard
> kdb_enter() at kdb_enter+0x35/frame
> vt_kbdevent() at vt_kdbevent+0x329/frame
> kdbmux_intr() at kbdmux_intr+0x19/frame
> taskqueue_run_locked() at taskqueue_run_locked+0x175/frame
> taskqueue_run() at taskqueue_run+0x44/frame
> taskqueue_swi_giant_run(0) at taskqueue_swi_giant_run+0xe/frame
> ithread_loop() at ithread_loop+0x237/frame
> fork_exit() at fork_exit+0x6c/frame
> fork_trampoline() at 0x../frame
>
> > show all vmem
> vmem 0x.. 'transient arena'
> quantum: 4096
> size: 23592960
> inuse: 0
> free: 23592960
> busy tags: 0
> free tags: 2
> inuse size free size
> 16777216 0 0 1 23592960
> vmem 0x.. 'buffer arena'
> quantum: 4096
> size: 94683136
> inuse: 94502912
> free: 180224
> busy tags: 1463
> free tags: 3
> inuse size free size
> 16384 2 32768 1 16384
> 32768 39 1277952 1 32768
> 65536 1422 93192192 0 0
> 131072 0 0 1 131072
> vmem 0x.. 'i386trampoline'
> quantum: 1
> size: 24576
> inuse: 20860
> free: 3716
> busy tags: 9
> free tags: 3
> inuse size free size
> 32 1 48 1 52
> 64 2 208 0 0
> 128 2 280 0 0
> 2048 1 2048 1 3664
> 4096 2 8192 0 0
> 8192 1 10084 0 0
> vmem 0x.. 'kernel rwx arena'
> quantum: 4096
> size: 0
> inuse: 0
> free: 0
> busy tags: 0
> free tags: 0
> vmem 0x.. 'kernel area dom'
> quantum: 4096
> size: 56623104
> inuse: 56582144
> free: 40960
> busy tags: 11224
> free tags: 3
I think this is the trouble.
Did you tried to reduce kern.maxvnodes ? What is the default value for
the knob on your machine ?
We scaled maxvnodes for ZFS and UFS, might be NFS is even more demanding,
having larger node size.
> inuse size free size
> 4096 11091 45428736 0 0
> 8192 63 516096 0 0
> 16384 12 196608 0 0
> 32768 6 196608 0 0
> 40960 2 81920 1 40960
> 65536 16 1048576 0 0
> 94208 1 94208 0 0
> 110592 1 110592 0 0
> 131072 15 2441216 0 0
> 262144 15 3997696 0 0
> 524288 1 524288 0 0
> 1048576 1 1945600 0 0
> vmem 0x.. 'kernel arena'
> quantum: 4096
> size: 390070272
> inuse: 386613248
> free: 3457024
> busy tags: 873
> free tags: 3
> inuse size free size
> 4096 35 143360 1 4096
> 8192 2 16384 2 16384
> 12288 1 12288 0 0
> 16384 30 491520 0 0
> 20480 140 2867200 0 0
> 65536 1 65536 0 0
> 131072 631 82706432 0 0
> 1048576 0 0 1 1339392
> 2097152 27 56623104 1 2097152
> 8388608 1 13774848 0 0
> 16777216 3 74883072 0 0
> 33554432 1 36753408 0 0
> 67108864 1 118276096 0 0
>
> > alltrace
> - I can't face typing too much more, but I'll put a few
> here that look interesting
>
> - for csh
> sched_switch()
> mi_switch()
> kern_yield()
> getblkx()
> breadn_flags()
> ffs_update()
> ufs_inactive()
> VOP_INACTIVE()
> vinactivef()
> vput_final()
> vm_object_deallocate()
> vm_map_process_deferred()
> kern_munmap()
> sys_munmap()
>
> - For cc
> sched_switch()
> mi_switch()
> sleepq_switch()
> sleepq_timedwait()
> _sleep()
> pause_sbt()
> vmem_bt_alloc()
> keg_alloc_slab()
> zone_import()
> cache_alloc()
> cache_alloc_retry()
> uma_zalloc_arg()
> bt_fill()
> vmem_xalloc()
> vmem_alloc()
> kmem_alloc()
> kmem_malloc_domainset()
> page_alloc()
> keg_alloc_slab()
> zone_import()
> cache_alloc()
> cache_alloc_retry()
> uma_zalloc_arg()
> nfscl_nget()
> nfs_create()
> vop_sigdefer()
> nfs_vnodeops_bypass()
> VOP_CREATE_APV()
> vn_open_cred()
> vn_open()
> kern_openat()
> sys_openat()
>
> Then there are a bunch of these... for cc, make
> sched_switch()
> mi_switch()
> sleepq_switch()
> sleepq_catch_signals()
> sleepq_wait_sig()
> kern_wait6()
> sys_wait4()
>
> - for vnlru
> sched_switch()
> mi_switch()
> sleepq_switch()
> sleepq_timedwait()
> _sleep()
> vnlru_proc()
> fork_exit()
> fork_trampoline()
>
> - for syncer
> sched_switch()
> mi_switch()
> critical_exit_preempt()
> intr_event_handle()
> intr_execute_handlers()
> lapic_handle_intr()
> Xapic_isr1()
> - interrupt
> memset()
> cache_alloc()
> cache_alloc_retry()
> uma_zalloc_arg()
> vmem_xalloc()
> vmem_bt_alloc()
> keg_alloc_slab()
> zone_import()
> cache_alloc()
> cache_alloc_retry()
> uma_zalloc_arg()
> bt_fill()
> vmem_xalloc()
> vmem_alloc()
> bufkva_alloc()
> getnewbuf()
> getblkx()
> breadn_flags()
> ffs_update()
> ffs_sync()
> sync_fsync()
> VOP_FSYNC_APV()
> sched_sync()
> fork_exit()
> fork_trampoline()
>
> - For bufdaemon (a bunch of these)
> sched_switch()
> mi_switch()
> sleepq_switch()
> sleepq_timedwait()
> _sleep()
> buf_daemon()
> fork_exit()
> fork_trampoline()
>
> vmdaemon and pagedaemon are basically just like above,
> sleeping in
> vm_daemon()
> or
> vm_pageout_worker()
> or
> vm_pageout_laundry_worker()
> or
> uma_reclaim_worker()
>
> That's all the typing I can take right now.
> I can probably make this happen again if you want more specific stuff.
>
> rick
>
>
>
>
More information about the freebsd-current
mailing list