[Bug 237195] pthread_mutex_unlock crash as unlocked mutex destroyed by signaled thread

From: <bugzilla-noreply_at_freebsd.org>
Date: Sat, 06 Aug 2022 20:55:45 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237195

--- Comment #11 from longwitz@incore.de ---
The stack trace with samba given in comment #9 is a different problem, because
tdb never calls pthread_mutex_destroy() for a mutex in shared memory, only for
one in local memory when checking the features of the libthr library.

I hit the problem discussed in this bug report when I tried to migrate a
stable, running IMAP server (cyrus-imapd25 + db48) from FreeBSD 10 to
FreeBSD 12 r371380. The configure step of db48 finds

  in V10: checking for mutexes... x86_64/gcc-assembly
  in V12: checking for mutexes... POSIX/pthreads/library/x86_64/gcc-assembly

The difference arises because FreeBSD 12 supports PTHREAD_PROCESS_SHARED. It
turns out that the Berkeley DB software (db48) is not compatible with the
shared mutexes in our libthr. When an IMAP program (e.g. cyr_expire) finishes,
db48 calls pthread_mutex_destroy() for all mutexes it used. On the next attempt
by the IMAP server or another IMAP program to take a lock with
pthread_mutex_lock(), the error EINVAL is returned and the Berkeley DB
environment is broken.

Newer versions of Berkeley DB have the same problem. I can simply avoid the
problem with a configure option for db48 that forces only the test-and-set
mutexes, as in V10.

I attach a test program mypshared.c that demonstrates the problem I see with my
IMAP server in V12. If mypshared is started on two or three terminals, the
problem with pthread_mutex_destroy() can be studied. If the variable
always_destroy in the program is set to 0, all started programs run and stop
without error, because only the last program to finish calls
pthread_mutex_destroy(). With always_destroy = 1 there are two cases when the
first program comes to its end:

1. None of the other programs holds the lock. In this case the ending program
calls pthread_mutex_destroy() without error, but the other programs get EINVAL
at their next call to pthread_mutex_lock(). That is what I saw running my IMAP
server.

2. One of the other programs holds the lock. Then the ending program calling
pthread_mutex_destroy() aborts with a message like

   Fatal error 'mutex 0x800698000 own 0x8001872a is on list 0x8006861a0 0x0'
   at line 154 in file /usr/src/lib/libthr/thread/thr_mutex.c (errno = 0)

This is caused by the call of mutex_assert_not_owned() at line 481 because
_PTHREADS_INVARIANTS is set in the Makefile.

Both cases are bad: the application cannot work.

With higher or lower values for UPPER_BOUND_(NO)LOCK in mypshared.c it is easy
to get examples of both cases. My mutex is always at address 0x800697000, but
the error message gives 0x800698000. Instead of "0x8001872a" I sometimes see
"own 0x1872a".

The "first mutex problem" was solved with the commit for revision 297185. A
program of an application need not check whether it is the first program using
a mutex. A simple call of pthread_mutex_init() is enough; the pthread library
libthr is smart and handles this. Maybe the "last mutex problem" can be solved
too with similar logic. pthread_mutex_destroy() should only destroy the mutex
if the caller is the last user of the mutex, and pthread_mutex_destroy() should
always return without errors like EBUSY. If a user is not the last one and
calls pthread_mutex_destroy(), the mutex should not be affected and the user's
state should be the same as it was before calling pthread_mutex_init().
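
The "last user" rule can be illustrated at application level with a user
count kept next to the mutex. This is only a sketch of the idea, not how
libthr works or would implement it: struct counted_mutex and the three
counted_mutex_*() functions are hypothetical names, and the sketch assumes
that atomic_int is lock-free and therefore usable across processes when the
struct lives in shared memory. It does not handle a process that dies without
detaching, nor an attach that races with the last detach.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical wrapper: a mutex plus the number of attached users. */
struct counted_mutex {
    pthread_mutex_t mtx;
    atomic_int users;
};

/* Called once by whoever sets up the (shared) region; the creator counts
 * as the first user. Returns 0 or a pthread error number. */
int counted_mutex_create(struct counted_mutex *cm)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0)
        rc = pthread_mutex_init(&cm->mtx, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc == 0)
        atomic_init(&cm->users, 1);
    return rc;
}

/* Every further user registers itself before using the mutex. */
void counted_mutex_attach(struct counted_mutex *cm)
{
    atomic_fetch_add(&cm->users, 1);
}

/* Stands in for the proposed pthread_mutex_destroy() behaviour: only the
 * last user really destroys the mutex. Returns 1 if the mutex was
 * destroyed by this call, 0 if other users remain. */
int counted_mutex_detach(struct counted_mutex *cm)
{
    int before = atomic_fetch_sub(&cm->users, 1);
    assert(before > 0);
    if (before == 1) {
        pthread_mutex_destroy(&cm->mtx);
        return 1;
    }
    return 0;
}
```

With two users attached, the first counted_mutex_detach() leaves the mutex
usable for the remaining user, and only the second one destroys it.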

If we cannot solve the "last mutex problem", then there should be an option to
build libthr without PTHREAD_PROCESS_SHARED.
