deischen at freebsd.org
Fri Dec 26 18:03:53 UTC 2014
On Fri, 26 Dec 2014, Konstantin Belousov wrote:
> It is somewhat well-known that our libthr.so cannot be loaded
> dynamically into the process. Or rather, it can be, but the
> consequences are catastrophic. We recommend explicitly linking any
> program which may load modules with -lpthread; the known workaround
> is to set LD_PRELOAD=libthr.so.3 for binaries which were not. I
> implemented support for ld -z nodlopen some time ago, but an attempt
> to mark libthr.so as non-loadable caused an extreme uproar.
> A common opinion is that the proper way to fix the problem is
> to merge the actual code from libthr into libc, leaving libthr as the
> filter to preserve the current ABI. Unfortunately, there are some
> non-trivial and undesirable consequences of doing this.
> First, all pthread mutexes (and other kinds of locks) would become
> fully initialized and used even for single-threaded programs; at least
> I do not see a way to work around this. Right now, the libc shims for
> pthread_mutex_init(3) and pthread_mutex_lock(3) are no-ops. After the
> merge, init needs to allocate memory, and lock/unlock operations,
> although uncontended, will start costing one atomic each. In
> particular, malloc(3) and stdio(3) are affected.
> Another very delicate issue is introducing unwanted cancellation
> points into libc functions after libthr wrappers become mandatory.
> This is fixable, but requires a lot of mundane work and probably a long
> time to find missed places (i.e. bugs).
> There are probably more problems, and this brings an obvious
> alternative: fix the issues which make dlopen("libthr.so") so
> One known show-stopper is the broken errno after the load. The libthr
> provides the interposer for the errno and all cancellable functions
> from libc. If any interposed symbols have been resolved before the
> libthr.so was loaded, or non-lazy binding mode is requested, the
> bindings cannot be undone. In particular, references to __error(),
> which implements errno, are bound to return the location of the main
> thread's errno variable. Similarly, code referencing cancellable functions
> still gets the uncancellable libc implementations of them.
> Another issue is the recursion between malloc(3) and mutex_init().
> The statically initialized pthread_mutex_t needs some further
> initialization before first use. Jemalloc calls pthread_mutex_init(3)
> for internally-used mutexes, which is a no-op stub from libc until
> libthr is loaded. After the load, the first use of any mutex by
> malloc(3) leads to the thr_mutex.c initialization code, which needs
> calloc(3). This immediately leads to a hang due to recursion on some
> internal libthr umtx. Making the lock recursive does not solve the
> problem, which is the infinite mutual recursion between malloc and
> pthread_mutex_lock() for an uninitialized malloc mutex.
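The mutual recursion described above can be modelled in a few lines. This is only an illustrative sketch, not code from libthr or jemalloc: every demo_* name is hypothetical, and the depth cut-off exists solely so the model terminates instead of hanging.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_DEPTH 8        /* artificial cut-off so the model terminates */

static bool demo_mutex_ready;   /* false: only statically initialized so far */
static int demo_depth;          /* current nesting of demo_malloc() calls */
static int demo_max_seen;       /* deepest nesting observed */

static void *demo_malloc(size_t size);

/* Stand-in for first-use mutex initialization, which needs to allocate. */
static void demo_mutex_lock(void)
{
    if (!demo_mutex_ready) {
        (void)demo_malloc(64);  /* allocation re-enters the allocator lock */
        demo_mutex_ready = true;
    }
}

/* Stand-in for an allocator that takes its internal mutex on every call. */
static void *demo_malloc(size_t size)
{
    static char arena[64];

    (void)size;
    if (++demo_depth > demo_max_seen)
        demo_max_seen = demo_depth;
    if (demo_depth < DEMO_MAX_DEPTH)    /* without this, recursion never ends */
        demo_mutex_lock();
    demo_depth--;
    return arena;
}

/* Run one allocation against an uninitialized mutex; report nesting depth. */
int demo_recursion_depth(void)
{
    demo_mutex_ready = false;
    demo_depth = 0;
    demo_max_seen = 0;
    (void)demo_malloc(16);
    return demo_max_seen;
}
```

In this model demo_recursion_depth() always reaches the cut-off, because the "ready" flag is only set after the nested allocation returns. A recursive lock would not change that, which matches the point made in the quoted text.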
> Yet another issue is the signal handlers. The libthr routes signal
> delivery through its internal signal handler, to avoid interrupting
> critical sections. Any signal handler installed before libthr is
> loaded misses the wrapper, potentially breaking cancellation and
> critical sections.
> The proposed patch does the following:
> - Remove libthr interposers of the libc functions, including
> __error(). Instead, function calls are indirected through the
> interposing table, similar to how pthread stubs in libc are already
> done. Libc by default points either to syscall trampolines or to
> existing libc implementations. On libthr load, it rewrites the
> pointers to the cancellable implementations already in libthr.
> - Postpone the malloc(3) internal mutexes initialization until libthr
> is loaded.
> - Reinstall signal handlers with the wrapper on libthr load.
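The first item in the list above, indirection through a table that the thread library rewrites on load, might look roughly like the following. This is a minimal sketch under stated assumptions: the slot enum, the table, and all demo_* names are hypothetical and are not taken from the patch, and the real code would also need to consider how the pointer update is published to other threads.

```c
#include <stddef.h>

/* Hypothetical slot indices; the names used by the patch are not shown. */
enum demo_slot { DEMO_SLOT_ERROR, DEMO_SLOT_MAX };

typedef void (*demo_generic_fn)(void);
typedef int *(*demo_error_fn)(void);

static int demo_main_errno;               /* models the main thread's errno */
static _Thread_local int demo_thr_errno;  /* models a per-thread errno */

/* Default implementation, as a libc without threads would provide. */
static int *demo_libc_error(void)
{
    return &demo_main_errno;
}

/* Thread-aware implementation, as the thread library would provide. */
static int *demo_thr_error(void)
{
    return &demo_thr_errno;
}

/* The table starts out pointing at the default implementations. */
static demo_generic_fn demo_interpos[DEMO_SLOT_MAX] = {
    [DEMO_SLOT_ERROR] = (demo_generic_fn)demo_libc_error,
};

/*
 * Public entry point. Callers bind to this symbol once; the target is
 * looked up in the table on every call, so it can change after binding.
 */
int *demo_error(void)
{
    return ((demo_error_fn)demo_interpos[DEMO_SLOT_ERROR])();
}

/* Models the thread library's initializer rewriting the table on load. */
void demo_thr_init(void)
{
    demo_interpos[DEMO_SLOT_ERROR] = (demo_generic_fn)demo_thr_error;
}
```

Because callers only ever resolve demo_error(), a binding made before the thread library arrives stays valid, and the switch to the thread-aware implementation happens inside the table rather than in the symbol resolution.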
> The signal handler reinstallation on libthr initialization is only
> needed when libthr.so is dlopened. Performing 128*2 sigaction(2)
> calls on the startup of a binary which is linked to libthr, where
> libthr is guaranteed to install proper sighandler wrappers, is a huge
> waste. So, I perform the hand-over of signal handlers only for the
> dlopen-ed libthr, which now needs to detect loading at startup
> vs. dlopen. I was unable to distinguish the cases using existing
> facilities, so a new private rtld interface is implemented,
> _rtld_is_dlopened(), to query the way the library was brought into the
> process address space.
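A rough sketch of the hand-over itself, using only sigaction(2): query each signal, and reinstall any user handler behind a wrapper. All demo_* names are hypothetical, the limit of 128 is taken from the "128*2" figure above, and the sketch skips SA_SIGINFO handlers for brevity. The check for startup vs. dlopen is deliberately left out, since the exact signature of _rtld_is_dlopened() is not given in the post.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>

#define DEMO_NSIG 128   /* assumption, based on the 128*2 figure quoted above */

static void (*demo_user_handler[DEMO_NSIG])(int);
static volatile sig_atomic_t demo_wrapped_calls;
static volatile sig_atomic_t demo_user_calls;

/* Wrapper through which delivery is routed after the hand-over. */
static void demo_wrapper(int signo)
{
    /* A real thread library could defer delivery inside critical sections. */
    demo_wrapped_calls++;
    if (signo > 0 && signo < DEMO_NSIG && demo_user_handler[signo] != NULL)
        demo_user_handler[signo](signo);
}

/* Reinstall every already-registered one-argument handler via the wrapper. */
int demo_handover(void)
{
    int rewrapped = 0;

    for (int signo = 1; signo < DEMO_NSIG; signo++) {
        struct sigaction sa;

        if (sigaction(signo, NULL, &sa) != 0)
            continue;                   /* invalid or reserved signal */
        if (sa.sa_flags & SA_SIGINFO)
            continue;                   /* three-argument handlers omitted */
        if (sa.sa_handler == SIG_DFL || sa.sa_handler == SIG_IGN ||
            sa.sa_handler == demo_wrapper)
            continue;                   /* nothing to wrap */
        demo_user_handler[signo] = sa.sa_handler;
        sa.sa_handler = demo_wrapper;
        if (sigaction(signo, &sa, NULL) == 0)
            rewrapped++;
    }
    return rewrapped;
}

/* Sample handler standing in for one installed before the library loaded. */
static void demo_sample_handler(int signo)
{
    (void)signo;
    demo_user_calls++;
}

/* Install a handler "early", hand over, deliver once; 1 means both ran. */
int demo_run(void)
{
    struct sigaction sa;

    sa.sa_handler = demo_sample_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return 0;
    demo_handover();
    raise(SIGUSR1);
    return demo_wrapped_calls == 1 && demo_user_calls == 1;
}
```

The loop costs one query per signal plus one install per wrapped handler, which is the overhead the quoted text wants to avoid when the library is linked in at startup and can install its wrappers directly.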
> Without some special measures, static binaries would pull in the whole
> set of the interposed syscalls due to references from the
> interposition table. To fix it, the references are made weak. Also,
> to not pull in the pthread stubs, the interposition table is separate
> from the pthread stubs indirection table.
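For readers unfamiliar with weak references, here is a small sketch of the mechanism, assuming an ELF toolchain with GCC or Clang attribute syntax. The symbol demo_cancellable_read is hypothetical and intentionally never defined; the fallback function is part of the illustration only and is not something the patch is stated to do.

```c
#include <stddef.h>

/*
 * Weak reference to a symbol that nothing in this sketch defines. An
 * undefined weak symbol does not force the linker to pull an object
 * out of a static archive; if it stays undefined, its address is NULL.
 */
extern int demo_cancellable_read(void) __attribute__((weak));

typedef int (*demo_read_fn)(void);

/* Plain implementation that is always present. */
static int demo_plain_read(void)
{
    return 1;
}

/* Pick the weakly referenced implementation only if it was linked in. */
demo_read_fn demo_pick_read(void)
{
    if (demo_cancellable_read != NULL)
        return demo_cancellable_read;
    return demo_plain_read;
}
```

A strong reference in the same position would make the link fail or, with a static archive, drag in the object that defines the symbol, which is the effect the quoted paragraph is avoiding.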
> The patch is available at
> https://www.kib.kiev.ua/kib/libthr_dlopen.1.patch .
> Among other things, I tested it with the program illustrating the
> issues https://www.kib.kiev.ua/kib/threaded_errno.c .
> Note that you must use matching versions of rtld, libc and libthr.
> Using old ld-elf.so.1 or old libc.so.7 with new libthr.so.3 will
> break the system.
> Work was sponsored by The FreeBSD Foundation.
I took a once-over look at the patch and it looks good.
I never liked the intimacy between malloc and libpthread, but
there's nothing we can currently do about that until mutexes
become real objects instead of pointers.
More information about the freebsd-threads