FreeBSD deadlock (with fork?)
John Baldwin
jhb at freebsd.org
Thu Sep 18 21:38:07 UTC 2008
On Thursday 18 September 2008 12:31:42 am David Naylor wrote:
> Hi,
>
> I have a program that spawns a lot of subprocesses (with pipes open) from
> multiple threads. The problem is the program often deadlocks, but not
> consistently. Sometimes the program can run over 5 times to completion
> without incident and yet other times it locks within a few seconds.
>
> However if I limit the thread count to 1 the problem does not appear to be
> present.
>
> Here are the logs from gdb:
> (gdb) info thread
> 5 Thread 7021c0 (LWP 100203) 0x00000008009a2e8c in _umtx_op_err ()
> at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> 4 Thread a28480 (LWP 100174) 0x00000008009a2e8c in _umtx_op_err ()
> at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> 3 Thread a61d80 (LWP 100175) 0x00000008009a2e8c in _umtx_op_err ()
> at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> 2 Thread a61bc0 (LWP 100176) 0x00000008009a2e8c in _umtx_op_err ()
> at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> * 1 Thread a61840 (LWP 100177) 0x00000008009a2e8c in _umtx_op_err ()
> at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
>
>
> (gdb) bt
> #0 0x00000008009a2e8c in _umtx_op_err ()
> at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> #1 0x00000008009a1331 in cond_wait_common (cond=Variable "cond" is not
> available.
This is not waiting on a lock; this is a pthread_cond_wait() of some sort.
> (gdb) thr 2
> [Switching to thread 2 (Thread a61bc0 (LWP 100176))]#0 0x00000008009a2e8c in
> _umtx_op_err ()
> at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> 37 RSYSCALL_ERR(_umtx_op)
> (gdb) bt
> #0 0x00000008009a2e8c in _umtx_op_err ()
> at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
> #1 0x00000008009a1331 in cond_wait_common (cond=Variable "cond" is not
> available.
Similarly here. I don't think you have a deadlock. I think you have a bug
where you are missing a pthread_cond_signal() or pthread_cond_broadcast() or
some such. Or maybe you aren't holding the mutex when doing the signal or
broadcast.
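
For reference, here is a minimal sketch of the usual condition variable
pattern. The names (lock, ready_cv, ready, waiter, signal_ready, run_demo)
are made up for illustration and are not taken from your program. The
waiter re-checks a predicate in a loop, and the signalling side changes the
predicate and broadcasts while holding the same mutex, so a wakeup cannot
be lost between the check and the wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state, protected by 'lock'. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cv = PTHREAD_COND_INITIALIZER;
static bool ready = false;

/* Waiter: block until 'ready' becomes true. */
static void *waiter(void *arg)
{
    int *seen = arg;

    pthread_mutex_lock(&lock);
    /* Loop on the predicate: covers spurious wakeups and the case
     * where the signal was sent before this thread started waiting. */
    while (!ready)
        pthread_cond_wait(&ready_cv, &lock);
    *seen = 1;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Signaller: update the predicate and wake waiters with the mutex held. */
static void signal_ready(void)
{
    pthread_mutex_lock(&lock);
    ready = true;
    pthread_cond_broadcast(&ready_cv);
    pthread_mutex_unlock(&lock);
}

/* Start one waiter, signal it, and report whether it woke up (1 on success). */
int run_demo(void)
{
    pthread_t t;
    int seen = 0;
    int rc;

    pthread_mutex_lock(&lock);
    ready = false;              /* reset so the demo can be run again */
    pthread_mutex_unlock(&lock);

    rc = pthread_create(&t, NULL, waiter, &seen);
    assert(rc == 0);
    signal_ready();
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    return seen;
}
```

If the flag were set without taking the mutex, or the waiter called
pthread_cond_wait() without checking the flag first, the waiter could sleep
forever, which would look just like the backtraces above.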
--
John Baldwin