sem(4) lockup in python?

Jan Mikkelsen janm at transactionware.com
Tue Feb 7 15:50:12 UTC 2012


On 06/02/2012, at 3:49 AM, Attilio Rao wrote:

> 2012/2/5 Ivan Voras <ivoras at freebsd.org>:
>> On 5 February 2012 11:44, Garrett Cooper <yanegomi at gmail.com> wrote:
>> 
>>> 
>>>    'make MAKE_JOBS_NUMBER=1' is the workaround used right now.
>> 
>> David Xu suggested that it is a bug in Python - it doesn't set the
>> process-shared attribute when it calls sem_init(), but I've tried
>> patching it (replacing the port patch file with the one I've attached)
>> and I still get the hang.
> 
> Guys,
> it would be valuable if you do the following:
> 1) recompile your kernel with INVARIANTS, WITNESS and without WITNESS_SKIPSPIN
> 2a) If you have a serial console, please run the DDB stuff through it
> (go to point 3)
> 2b) If you don't have a serial console please run the DDB stuff in
> textdump (go to point 3)
> 3) Collect the following information:
> - show allpcpu
> - show alllocks
> - ps
> - alltrace
> 3a) If you had the serial console (thus not textdump) please collect
> the coredump with: call doadump
> 4) reset your machine
> 
> You will end up with the textdump or coredump + all the serial logs
> necessary to debug this.
> If you cannot reproduce your issue with WITNESS enabled, please remove
> it from your kernel config and avoid calling 'show alllocks' when in DDB.
> But try to leave INVARIANTS on.
> 
> Hope this helps,
> Attilio


This has just happened again, this time with MAKE_JOBS_NUMBER=1, so that workaround didn't work.

I don't have INVARIANTS or WITNESS compiled in, but I did fire up kgdb to poke around. The stack traces look identical. I don't know what to expect in these structures. If there's anything useful I can dig out here, please let me know.

However, a parent and child process both blocked waiting on semaphores smells like a user-level bug to me.

Jan.



(kgdb) proc 24969
[Switching to thread 648 (Thread 101022)]#0  sched_switch (td=0xfffffe003de43000, newtd=0xfffffe000b501000, flags=Variable "flags" is not available.
)
    at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/sched_ule.c:1854
1854			cpuid = PCPU_GET(cpuid);
(kgdb) where
#0  sched_switch (td=0xfffffe003de43000, newtd=0xfffffe000b501000, flags=Variable "flags" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/sched_ule.c:1854
#1  0xffffffff8083af24 in mi_switch (flags=260, newtd=0x0) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_synch.c:448
#2  0xffffffff80872644 in sleepq_catch_signals (wchan=0xfffffe0015fca800, pri=0) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/subr_sleepqueue.c:425
#3  0xffffffff80872fb6 in sleepq_wait_sig (wchan=Variable "wchan" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/subr_sleepqueue.c:631
#4  0xffffffff8083b599 in _sleep (ident=0xfffffe0015fca800, lock=0xffffffff81114860, priority=Variable "priority" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_synch.c:232
#5  0xffffffff8084ac69 in do_sem_wait (td=Variable "td" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_umtx.c:513
#6  0xffffffff8084ad61 in __umtx_op_sem_wait (td=0xfffffe003de43000, uap=0xffffff8693d85bc0) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_umtx.c:3205
#7  0xffffffff80b17de0 in amd64_syscall (td=0xfffffe003de43000, traced=0) at subr_syscall.c:131
#8  0xffffffff80b03517 in Xfast_syscall () at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/amd64/amd64/exception.S:387
#9  0x00000008010277fc in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) proc 24970
[Switching to thread 665 (Thread 100553)]#0  sched_switch (td=0xfffffe02f7240460, newtd=0xfffffe000b501460, flags=Variable "flags" is not available.
)
    at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/sched_ule.c:1854
1854			cpuid = PCPU_GET(cpuid);
(kgdb) where
#0  sched_switch (td=0xfffffe02f7240460, newtd=0xfffffe000b501460, flags=Variable "flags" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/sched_ule.c:1854
#1  0xffffffff8083af24 in mi_switch (flags=260, newtd=0x0) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_synch.c:448
#2  0xffffffff80872644 in sleepq_catch_signals (wchan=0xfffffe0015fd7380, pri=0) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/subr_sleepqueue.c:425
#3  0xffffffff80872fb6 in sleepq_wait_sig (wchan=Variable "wchan" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/subr_sleepqueue.c:631
#4  0xffffffff8083b599 in _sleep (ident=0xfffffe0015fd7380, lock=0xffffffff811145e0, priority=Variable "priority" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_synch.c:232
#5  0xffffffff8084ac69 in do_sem_wait (td=Variable "td" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_umtx.c:513
#6  0xffffffff8084ad61 in __umtx_op_sem_wait (td=0xfffffe02f7240460, uap=0xffffff8694b04bc0) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_umtx.c:3205
#7  0xffffffff80b17de0 in amd64_syscall (td=0xfffffe02f7240460, traced=0) at subr_syscall.c:131
#8  0xffffffff80b03517 in Xfast_syscall () at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/amd64/amd64/exception.S:387
#9  0x00000008010277fc in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) up
#1  0xffffffff8083af24 in mi_switch (flags=260, newtd=0x0) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_synch.c:448
448		sched_switch(td, newtd, flags);
(kgdb) up
#2  0xffffffff80872644 in sleepq_catch_signals (wchan=0xfffffe0015fd7380, pri=0) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/subr_sleepqueue.c:425
425			sleepq_switch(wchan, pri);
(kgdb) up
#3  0xffffffff80872fb6 in sleepq_wait_sig (wchan=Variable "wchan" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/subr_sleepqueue.c:631
631		rcatch = sleepq_catch_signals(wchan, pri);
(kgdb) up
#4  0xffffffff8083b599 in _sleep (ident=0xfffffe0015fd7380, lock=0xffffffff811145e0, priority=Variable "priority" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_synch.c:232
232			rval = sleepq_wait_sig(ident, pri);
(kgdb) up
#5  0xffffffff8084ac69 in do_sem_wait (td=Variable "td" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_umtx.c:513
warning: Source file is more recent than executable.

513		error = msleep(uq, &uc->uc_lock, PCATCH, wmesg, timo);
(kgdb) p *uq
$1 = {uq_link = {tqe_next = 0x0, tqe_prev = 0xfffffe0015fd2080}, uq_key = {hash = 186, type = 2, shared = 0, info = {shared = {object = 0xfffffe00162b0310, offset = 34380812388}, private = {
        vs = 0xfffffe00162b0310, addr = 34380812388}, both = {a = 0xfffffe00162b0310, b = 34380812388}}}, uq_flags = 1, uq_thread = 0xfffffe02f7240460, uq_pi_blocked = 0x0, uq_lockq = {
    tqe_next = 0x0, tqe_prev = 0x0}, uq_pi_contested = {tqh_first = 0x0, tqh_last = 0xfffffe0015fd73d8}, uq_inherited_pri = 255 '?', uq_spare_queue = 0x0, uq_cur_queue = 0xfffffe0015fd2080}
(kgdb) proc 24969
[Switching to thread 648 (Thread 101022)]#0  sched_switch (td=0xfffffe003de43000, newtd=0xfffffe000b501000, flags=Variable "flags" is not available.
)
    at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/sched_ule.c:1854
1854			cpuid = PCPU_GET(cpuid);
(kgdb) up
#1  0xffffffff8083af24 in mi_switch (flags=260, newtd=0x0) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_synch.c:448
448		sched_switch(td, newtd, flags);
(kgdb) up
#2  0xffffffff80872644 in sleepq_catch_signals (wchan=0xfffffe0015fca800, pri=0) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/subr_sleepqueue.c:425
425			sleepq_switch(wchan, pri);
(kgdb) up
#3  0xffffffff80872fb6 in sleepq_wait_sig (wchan=Variable "wchan" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/subr_sleepqueue.c:631
631		rcatch = sleepq_catch_signals(wchan, pri);
(kgdb) up
#4  0xffffffff8083b599 in _sleep (ident=0xfffffe0015fca800, lock=0xffffffff81114860, priority=Variable "priority" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_synch.c:232
232			rval = sleepq_wait_sig(ident, pri);
(kgdb) up
#5  0xffffffff8084ac69 in do_sem_wait (td=Variable "td" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_umtx.c:513
513		error = msleep(uq, &uc->uc_lock, PCATCH, wmesg, timo);
(kgdb) p *uq
$2 = {uq_link = {tqe_next = 0x0, tqe_prev = 0xfffffe04fc73c280}, uq_key = {hash = 194, type = 2, shared = 0, info = {shared = {object = 0xfffffe001628d188, offset = 34380814884}, private = {
        vs = 0xfffffe001628d188, addr = 34380814884}, both = {a = 0xfffffe001628d188, b = 34380814884}}}, uq_flags = 1, uq_thread = 0xfffffe003de43000, uq_pi_blocked = 0x0, uq_lockq = {
    tqe_next = 0x0, tqe_prev = 0x0}, uq_pi_contested = {tqh_first = 0x0, tqh_last = 0xfffffe0015fca858}, uq_inherited_pri = 255 '?', uq_spare_queue = 0x0, uq_cur_queue = 0xfffffe04fc73c280}
(kgdb) 
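
For anyone comparing the two uq_key dumps by hand, here is a small Python sketch that puts them side by side. The values are copied from the `p *uq` output above (kgdb prints addr in decimal). The 4 KiB page size and the helper name describe() are my own assumptions for illustration, not anything taken from the kernel source.

```python
# Values copied from the two `p *uq` dumps above; kgdb prints addr in decimal.
waiters = {
    24970: {"shared": 0, "vs": 0xfffffe00162b0310, "addr": 34380812388},
    24969: {"shared": 0, "vs": 0xfffffe001628d188, "addr": 34380814884},
}

PAGE_SIZE = 4096  # assumption: 4 KiB pages on amd64


def describe(pid, key):
    """Return a one-line summary of a umtx key (hypothetical helper)."""
    kind = "process-shared" if key["shared"] else "process-private"
    page = key["addr"] & ~(PAGE_SIZE - 1)
    return "pid %d: %s key, vs=%#x, addr=%#x (page %#x)" % (
        pid, kind, key["vs"], key["addr"], page)


for pid, key in sorted(waiters.items()):
    print(describe(pid, key))

# Two waiters sleep on the same key only if both vs and addr match.
same_key = len({(k["vs"], k["addr"]) for k in waiters.values()}) == 1
print("both waiting on the same key:", same_key)
```

Both keys have shared = 0, the vs pointers differ, and the two user addresses fall in the same page but are not equal. As far as I can tell that means each process is sleeping on its own private key. It does not by itself show which process was expected to post either semaphore.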



More information about the freebsd-hackers mailing list