Re: git: 9a2ae72421cd - main - libthr: switch thread and sleepq memory allocator to crt from libc malloc
Date: Tue, 14 Jan 2025 22:28:27 UTC
On Wed, Jan 15, 2025 at 12:19:08AM +0200, Konstantin Belousov wrote:
> On Tue, Jan 14, 2025 at 03:42:52PM -0500, Mark Johnston wrote:
> > On Tue, Jan 14, 2025 at 05:55:15PM +0000, Konstantin Belousov wrote:
> > > The branch main has been updated by kib:
> > >
> > > URL: https://cgit.FreeBSD.org/src/commit/?id=9a2ae72421cd75c741984f63b8c9ee89346a188d
> > >
> > > commit 9a2ae72421cd75c741984f63b8c9ee89346a188d
> > > Author:     Konstantin Belousov <kib@FreeBSD.org>
> > > AuthorDate: 2025-01-14 09:06:58 +0000
> > > Commit:     Konstantin Belousov <kib@FreeBSD.org>
> > > CommitDate: 2025-01-14 17:55:08 +0000
> > >
> > >     libthr: switch thread and sleepq memory allocator to crt from libc malloc
> > >
> > >     There are more complex interactions between malloc and libthr
> > >     initialization that can happen if libthr functions are called from ELF
> > >     objects' constructors, before libthr is initialized.  Break the
> > >     dependencies loop by using the private allocator with controlled init.
> > >
> > >     Reported by:    yuri
> > >     Reviewed by:    markj, olce
> > >     Sponsored by:   The FreeBSD Foundation
> > >     MFC after:      1 week
> > >     Differential revision:  https://reviews.freebsd.org/D48454
> > 
> > I see some startup deadlock when running the googletest regression tests
> > (/usr/tests/lib/googletest/gmock_main) after this commit.  gdb (which
> > itself also hangs due to this bug) shows:
> > 
> > (gdb) bt
> > #0  _umtx_op_err () at /home/markj/sb/main/src/lib/libsys/amd64/_umtx_op_err.S:38
> > #1  0x000015e1ba96fd2c in __thr_umutex_lock (mtx=0x15e1ba974468, id=100113) at /usr/src/lib/libthr/thread/thr_umtx.c:69
> > #2  0x000015e1ba966a41 in __thr_calloc (num=1, size=17) at /usr/src/lib/libthr/thread/thr_malloc.c:92
> > #3  0x000015e1ba969213 in mutex_init (mutex=warning: (Internal error: pc 0x15e1bd5c0240 in read in CU, but not in symtab.)
> > warning: (Error: pc 0x15e1bd5c0240 in address map, but not in symtab.)
> 
> The following fixed the issue for me.  I am somewhat surprised that the
> problem did not manifest itself before.

It seems to fix the hang for me as well, thanks.

> commit 783d95d0d6e6e508705cf16cfd9e4a5e2f8db8e4
> Author: Konstantin Belousov <kib@FreeBSD.org>
> Date:   Wed Jan 15 00:11:48 2025 +0200
> 
>     libpthread_init(): ensure curthread == NULL until set explicitly
>     
>     Otherwise libthr::_get_curthread() returns a garbage kept there from
>     allocate_initial_tls(), until libthr initialization proceeds enough to
>     set initial pcb->pcb_thread.  The garbage pcb_thread was dereferenced
>     as struct pthread and some memory read as TID.  Since it was not
>     consistent between reads, thr_malloc_umtx unlock returned EPERM instead
>     of clearing the lock word.
>     
>     Reported by:    markj
>     Sponsored by:   The FreeBSD Foundation
>     MFC after:      1 week
> 
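For readers following the failure mode described in the commit message above, here is a minimal toy model of an owner-checked lock word.  It is only a sketch: toy_mutex, toy_trylock(), toy_unlock() and demo_stale_tid_unlock() are hypothetical names, not libthr or kernel code, and the real umtx path is more involved.  It assumes the lock word stores the owner's TID and that unlock clears it only when the caller's TID matches.  The TID 100113 is taken from the backtrace; 4242 is an arbitrary stand-in for the garbage value.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a umutex: the word holds the owner TID, 0 if free. */
struct toy_mutex {
	_Atomic uint32_t owner;
};

/* Take the lock if it is free; a real lock would sleep instead of EBUSY. */
int
toy_trylock(struct toy_mutex *m, uint32_t tid)
{
	uint32_t expected = 0;

	return (atomic_compare_exchange_strong(&m->owner, &expected, tid) ?
	    0 : EBUSY);
}

/* Clear the lock word only if the caller's TID matches the recorded owner. */
int
toy_unlock(struct toy_mutex *m, uint32_t tid)
{
	uint32_t expected = tid;

	return (atomic_compare_exchange_strong(&m->owner, &expected, 0) ?
	    0 : EPERM);
}

/*
 * Lock with one TID, then unlock with a different (garbage) TID, as happens
 * when the TID is read through a bogus curthread pointer.  Returns the lock
 * word left behind.
 */
uint32_t
demo_stale_tid_unlock(void)
{
	struct toy_mutex m = { 0 };
	int error;

	error = toy_trylock(&m, 100113);	/* TID as read at lock time */
	assert(error == 0);
	error = toy_unlock(&m, 4242);		/* different TID at unlock time */
	assert(error == EPERM);
	error = toy_trylock(&m, 100113);	/* the next locker cannot proceed */
	assert(error == EBUSY);
	(void)error;
	return (atomic_load(&m.owner));
}
```

In this model the failed unlock leaves the word set to the old TID, so a later blocking lock attempt would wait forever, which matches the shape of the hang in __thr_calloc().
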
> diff --git a/lib/libthr/thread/thr_init.c b/lib/libthr/thread/thr_init.c
> index 708c425d69c1..e5e438897dee 100644
> --- a/lib/libthr/thread/thr_init.c
> +++ b/lib/libthr/thread/thr_init.c
> @@ -334,6 +334,7 @@ _libpthread_init(struct pthread *curthread)
>  	/* Set the initial thread. */
>  	if (curthread == NULL) {
>  		first = 1;
> +		_tcb_get()->tcb_thread = NULL;
>  		/* Create and initialize the initial thread. */
>  		curthread = _thr_alloc(NULL);
>  		if (curthread == NULL)