assertion when destroying a process shared mutex

Andriy Gapon avg at FreeBSD.org
Fri Sep 20 16:52:26 UTC 2019


Fatal error 'mutex 0x800661000 own 0x80000010 is on list 0x8006591a0 0x0' at
line 153 in file /usr/src/lib/libthr/thread/thr_mutex.c (errno = 0)

This happens with a mutex initialized with PTHREAD_PROCESS_SHARED,
PTHREAD_MUTEX_ROBUST and PTHREAD_MUTEX_ERRORCHECK.
The situation that leads to the abort seems to be this:
- one process takes the lock and then crashes without releasing the lock
- some time later another process does a cleanup and attempts to destroy the mutex
That's where the assertion happens.

Specifically, it seems that the assert is tripped if there are no other
operations on the lock between the crash of one process and the destroy in
the other process.

I wrote a small test program to demo the issue:
https://people.freebsd.org/~avg/shared_mtx.c

The state of the mutex in a crash dump is this:
(gdb) p/x *(struct pthread_mutex *)0x800661000
$6 = {m_lock = {m_owner = 0x80000010, m_flags = 0x11, m_ceilings = {0x0, 0x0},
    m_rb_lnk = 0x0, m_spare = {0x0, 0x0}}, m_flags = 0x1, m_count = 0x0,
  m_spinloops = 0x0, m_yieldloops = 0x0, m_ps = 0x2,
  m_qe = {tqe_next = 0x0, tqe_prev = 0x8006591a0},
  m_pqe = {tqe_next = 0x0, tqe_prev = 0x0}, m_rb_prev = 0x0}

So, it's m_qe.tqe_prev != NULL that leads to the assert.
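
To make that condition concrete, here is a simplified model of the check. It
is not the libthr source: the struct and function names (model_mutex,
model_is_on_list, model_from_dump) are invented for this example, and I am
assuming the check treats any non-NULL queue link as "still on a list", which
matches the two list pointers printed in the fatal error message. The values
are the ones from the dump above.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the queue linkage of a mutex. */
struct model_link {
	void *tqe_next;
	void *tqe_prev;
};

/* Only the fields relevant to the check are modeled. */
struct model_mutex {
	uint32_t m_owner;
	struct model_link m_qe;
};

/* Assumed condition: any non-NULL link means "still on a list". */
int
model_is_on_list(const struct model_mutex *m)
{
	return (m->m_qe.tqe_prev != NULL || m->m_qe.tqe_next != NULL);
}

/* Build a model mutex with the values seen in the crash dump. */
struct model_mutex
model_from_dump(void)
{
	struct model_mutex m;

	m.m_owner = 0x80000010;
	m.m_qe.tqe_next = NULL;
	m.m_qe.tqe_prev = (void *)(uintptr_t)0x8006591a0ULL;
	return (m);
}
```

With the dumped values model_is_on_list() returns non-zero, because tqe_prev
still holds the address left behind by the dead owner.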

-- 
Andriy Gapon

