"sleeping without queue" ?

Kostik Belousov kostikbel at gmail.com
Wed Jul 23 12:04:02 UTC 2008


On Tue, Jul 22, 2008 at 03:59:57PM -0400, Mikhail Teterin wrote:
> Kostik Belousov wrote:
> >On Tue, Jul 22, 2008 at 03:26:29PM -0400, Mikhail Teterin wrote:
> >>Kostik Belousov wrote:
> >>>Did you switch to the process before doing the backtrace (using the
> >>>proc <pid> command)?
> >>Ok, thanks. Did not know about this one. Here:
> >>...
> >>(kgdb) proc 79759
> >>(kgdb) bt
> >>#0  sched_switch (td=0xffffff01286dc000, newtd=0xffffff00010ce000, 
> >>flags=2) at /var/src/sys/kern/sched_4bsd.c:928
> >>#1  0x0000000000000000 in ?? ()
> >>#2  0xffffffff802f1108 in mi_switch (flags=678281216, newtd=0x2) at 
> >>/var/src/sys/kern/kern_synch.c:442
> >>#3  0xffffffff80318513 in sleepq_check_timeout () at 
> >>/var/src/sys/kern/subr_sleepqueue.c:519
> >>#4  0xffffffff80318c85 in sleepq_timedwait (wchan=0xffffffff80688408) at 
> >>/var/src/sys/kern/subr_sleepqueue.c:597
> >>#5  0xffffffff802f16a2 in _sleep (ident=0xffffffff80688408, lock=0x0, 
> >>priority=0, wmesg=0xffffffff804f3059 "vmo_de", timo=1) at 
> >>/var/src/sys/kern/kern_synch.c:224
> >>#6  0xffffffff8043036b in vm_object_deallocate 
> >>(object=0xffffff0053024a90) at /var/src/sys/vm/vm_object.c:509
> >From this frame, please, print the object (like p *object) and
> >likewise, print the object that is at the head of the object->shadow_head
> >list.
> kgdb /usr/obj/var/src/sys/SILVER-SMP/kernel.debug /dev/mem
> [GDB will not be able to debug user-mode threads: 
> /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain 
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd".
> There is no member named pathname.
> Reading symbols from /opt/modules/fuse.ko...done.
> Loaded symbols for /opt/modules/fuse.ko
> Reading symbols from /opt/modules/rtc.ko...done.
> Loaded symbols for /opt/modules/rtc.ko
> Reading symbols from /boot/kernel/snd_ich.ko...Reading symbols from 
> /boot/kernel/snd_ich.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/snd_ich.ko
> Reading symbols from /boot/kernel/msdosfs.ko...Reading symbols from 
> /boot/kernel/msdosfs.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/msdosfs.ko
> #0  0x0000000000000000 in ?? ()
> (kgdb) frame 6
> Error accessing memory address 0x0: Bad address.
> (kgdb) pid 79759
> Undefined command: "pid".  Try "help".
> (kgdb) proc 79759
> (kgdb) frame 6
> #6  0xffffffff8043036b in vm_object_deallocate 
> (object=0xffffff0053024a90) at /var/src/sys/vm/vm_object.c:509
> 509                                             pause("vmo_de", 1);
> (kgdb) p *object
> $1 = {mtx = {lock_object = {lo_name = 0xffffffff804f21c4 "vm object", 
> lo_type = 0xffffffff804f3018 "standard object", lo_flags = 21168128, 
> lo_witness_data = {
>        lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, 
> mtx_recurse = 0}, object_list = {tqe_next = 0xffffff0005018a90,
>    tqe_prev = 0xffffff00539a6850}, shadow_head = {lh_first = 
> 0xffffff005d3afa90}, shadow_list = {le_next = 0x0, le_prev = 
> 0xffffff005d2cd048}, memq = {
>    tqh_first = 0xffffff007eb9fa58, tqh_last = 0xffffff007f864820}, root 
> = 0xffffff007ee14d38, size = 427, generation = 66, ref_count = 2, 
> shadow_count = 1,
>  type = 0 '\0', flags = 256, pg_color = 0, paging_in_progress = 0, 
> resident_page_count = 44, backing_object = 0x0, backing_object_offset = 
> 0, pager_object_list = {
>    tqe_next = 0x0, tqe_prev = 0x0}, cache = 0x0, handle = 0x0, un_pager 
> = {vnp = {vnp_size = 576646}, devp = {devp_pglist = {tqh_first = 0x8cc86,
>        tqh_last = 0x0}}, swp = {swp_bcount = 576646}}}
> (kgdb) p (object->shadow_head)
> $2 = {lh_first = 0xffffff005d3afa90}
> (kgdb) p *object->shadow_head.lh_first
> $3 = {mtx = {lock_object = {lo_name = 0xffffffff804f21c4 "vm object", 
> lo_type = 0xffffffff804f3018 "standard object", lo_flags = 21168128, 
> lo_witness_data = {
>        lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, 
> mtx_recurse = 0}, object_list = {tqe_next = 0xffffff0066c32340,
>    tqe_prev = 0xffffff012f673ac0}, shadow_head = {lh_first = 0x0}, 
> shadow_list = {le_next = 0x0, le_prev = 0xffffff0053024ad0}, memq = {
>    tqh_first = 0xffffff007779f9a0, tqh_last = 0xffffff0077c04140}, root 
> = 0xffffff0077c04130, size = 387, generation = 3, ref_count = 1, 
> shadow_count = 0,
>  type = 0 '\0', flags = 8452, pg_color = 0, paging_in_progress = 0, 
> resident_page_count = 2, backing_object = 0xffffff0053024a90, 
> backing_object_offset = 163840,
>  pager_object_list = {tqe_next = 0x0, tqe_prev = 0x0}, cache = 0x0, 
> handle = 0x0, un_pager = {vnp = {vnp_size = 365278}, devp = {devp_pglist = {
>        tqh_first = 0x592de, tqh_last = 0x0}}, swp = {swp_bcount = 365278}}}
> 
> 
> >
> >Another question: which scheduler do you use?
> options         SCHED_4BSD              # 4BSD scheduler
> options         PREEMPTION              # Enable kernel thread preemption
The state of both the object being destroyed and the object that is shadowed
looks right to me. Moreover, the shadowed object is not locked: the value
of mtx_lock is 4. It seems as if the thread missed the wakeup it was
owed by pause().
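
To show how the mtx_lock value is read, here is a minimal userland sketch.
It assumes the flag bits from sys/mutex.h of the 7.x era (MTX_RECURSED = 1,
MTX_CONTESTED = 2, MTX_UNOWNED = 4), with the owning thread pointer stored
in the remaining bits; check the header of your own tree before relying on
the values. The names struct mtx_state and decode_mtx_lock() are
hypothetical helpers, not kernel interfaces.

```c
#include <stdint.h>

/* Flag bits ASSUMED from sys/mutex.h (FreeBSD 7.x); verify in your tree. */
#define MTX_RECURSED  ((uintptr_t)0x1)
#define MTX_CONTESTED ((uintptr_t)0x2)
#define MTX_UNOWNED   ((uintptr_t)0x4)
#define MTX_FLAGMASK  (MTX_RECURSED | MTX_CONTESTED | MTX_UNOWNED)

/* Hypothetical helper type: a decoded view of the mtx_lock word. */
struct mtx_state {
	uintptr_t owner;     /* owning thread pointer, 0 if none */
	int       recursed;  /* lock was acquired recursively */
	int       contested; /* other threads are waiting on the lock */
	int       unowned;   /* nobody holds the lock */
};

/* Split the raw mtx_lock value, as printed by kgdb, into owner and flags. */
static struct mtx_state
decode_mtx_lock(uintptr_t v)
{
	struct mtx_state s;

	s.owner = v & ~MTX_FLAGMASK;
	s.recursed = (v & MTX_RECURSED) != 0;
	s.contested = (v & MTX_CONTESTED) != 0;
	s.unowned = (v & MTX_UNOWNED) != 0;
	return (s);
}
```

Under these assumptions decode_mtx_lock(4) gives owner == 0 and unowned
set, which is why both objects printed above look unlocked.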

John, could it be that the following commit is supposed to fix the issue?

r179974 | jhb | 2008-06-24 22:36:33 +0300 (Tue, 24 Jun 2008) | 3 lines

MFC: Change the roundrobin implementation in the 4BSD scheduler to trigger a
userland preemption directly from hardclock() via sched_clock()

> 
> >>>Also, show the output of ps axl <pid>.
> >> UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN STAT  TT       TIME COMMAND
> >>   0 79759 79758   0  96  0     0    16 -      DE+   p6    0:00,00 
> >>/bin/tcsh -fc 
> >>/meow/ports/editors/openoffice.org-3/work/BEB300_m3/solver/300/unxfbsdx.pro/bin/ma
> >
> >It makes sense to show the whole ps axl output.
> See http://aldan.algebra.com/~mi/tmp/ps-axl.txt -- I edited it for 
> privacy a little bit, but process-states are intact.
> The java-processes in the linuxf have remained unkillable for weeks now 
> -- I even forgot about them. But those are linuxulator problems, whereas 
> the tcsh is native...
It seems that pid 63930 is problematic too?

More information about the freebsd-stable mailing list