Question about rtld-elf. Anyone?.. Anyone?

Tue Apr 29 22:43:25 PDT 2003

Peter Wemm wrote:
> Narvi wrote:
> > Well, I think we should guarantee something about the sanity of libc
> > internals to the forked process.
> 
> This is getting a bit sidetracked here.  Remember the issue at hand
> is how to protect the ld-elf.so.1 internals in a pthread context.  Not
> how to recover from a fork().
> 
> One way I've seen is to have libc and the respective pthreads libraries
> provide the public access to things like dlopen() etc.  That way, the
> threads package of your choice does its own serialization of the entry
> points into the dynamic linker guts/internals.  As John Polstra said
> earlier, he has some thoughts about how to make the actual lazy symbol
> lookup be thread-safe.
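
For concreteness, a minimal sketch of that kind of wrapping: the threads
library owns a lock and takes it around every call into the linker.  All of
the sketch_* names are hypothetical, and sketch_rtld_open() is only a
stand-in for the real ld-elf.so.1 internals; this is not the actual libc or
rtld code.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical vector of entry points exported by the dynamic linker. */
struct sketch_rtld_vector {
    void *(*open)(const char *path, int mode);
};

/* Counts calls into the stand-in linker; for demonstration only. */
static int sketch_open_calls;

/* Stand-in for the linker internals; a real one would map the object. */
static void *sketch_rtld_open(const char *path, int mode)
{
    (void)path;
    (void)mode;
    sketch_open_calls++;
    return &sketch_open_calls;      /* dummy non-NULL handle */
}

static struct sketch_rtld_vector sketch_vector = { sketch_rtld_open };

/* Hypothetical lock owned by the threads library. */
static pthread_mutex_t sketch_rtld_lock = PTHREAD_MUTEX_INITIALIZER;

/* Public entry point: serialize every call through the vector. */
void *sketch_dlopen(const char *path, int mode)
{
    void *handle;

    pthread_mutex_lock(&sketch_rtld_lock);
    handle = sketch_vector.open(path, mode);
    pthread_mutex_unlock(&sketch_rtld_lock);
    return handle;
}
```

This covers only the explicit entry points; lazy symbol binding does not go
through such a wrapper and would need its own protection.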

This is actually hard to do, given the way that the ld.so gets
mapped in and referenced by the crt0 code.  I don't actually
know if it would even be possible, even with two levels of weak
symbol indirection.  One really hard problem is a statically
linked threaded program still trying to override the symbols
with weak references to things that aren't there (hence the two
levels of indirection).

One of the "magic" things that's assumed in the SUS, and by the
X/Open and POSIX codification of SVR4 behaviour as "correct UNIX"
(be honest: that's what those standards are) is that there is a
"reserved" low end memory range, well below the link address of
user programs.  This is actually used for the *kernel* to map the
ld.so into the process address space on the process's behalf; it's
why there's a dlopen() available in static binaries on SVR4.

This change is pretty radical from a BSD perspective, not least
because the base address range would have to be prereserved, but
it would certainly solve the problem.

> If I recall correctly, our old a.out based shared lib implementation did it
> precisely this way.  dlopen() was a function in libc, that called through
> a vector into the guts of ld.so.1.  The dynamic linker itself never provided
> direct call access to this stuff.  Some systems put these public functions
> in a separate library, -ldl.  The ELF implementation that we use does provide
> direct access, and doesn't give the threads library a chance to wrap them.

Yes.  Now it's referenced through the crt0 stuff; this happened
when we did a compiler and tools import, and changed to "the GCC
way" of doing things, in the great ELF switchover.  In many ways,
I miss a.out: ELF has failed to have its capabilities used to a
reasonable extent (pageable code and data sections in the kernel,
etc.).  8-(.

> (And no, this is not an invitation for getting sidetracked on making
> ld-elf.so.1 into libdl.so.1 as a service library, etc etc)
> 
> How would things go if we renamed the ld-elf.so functions to __rtld_dlopen()
> etc and then had libc provide a weak dlopen() function that redirected to
> __rtld_dlopen(), and give libpthread a chance to provide a replacement?
> And of course, deal with making the runtime symbol resolution as John
> suggested in the commit logs.

You would need two weak references: one that is a weak reference to
the "real" reference (the "fake" functions in the non-dynamic case),
and another that is a weak reference to the weak reference, to let
libc_r (or whatever) override the libc reference.  Last time I
checked my linker-foo, there was a problem with doing this in GCC.  It
caused a library link error in Archie's work on the Kaffe JNI code,
while using dlopen; the only fix was to statically link an intermediate
library that had the middle symbols in it.  Maybe this could be jammed
into the crt0 code.
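
Roughly, as a sketch using GCC's weak attribute (the sketch_* names are
made up for illustration, and this uses two weak definitions rather than
whatever the final symbol layout would have to be):

```c
#include <stddef.h>

/*
 * Level 1: the "fake" function for the non-dynamic case.  It is a weak
 * definition, so a real dynamic linker entry point with the same
 * (hypothetical) name would replace it at link time.
 */
__attribute__((weak)) void *sketch_rtld_dlopen(const char *path, int mode)
{
    (void)path;
    (void)mode;
    return NULL;                /* no dynamic linker: always fail */
}

/*
 * Level 2: the libc-level entry point.  It is also weak, so a threads
 * library can supply a strong sketch_dlopen() that takes its own lock
 * and then calls sketch_rtld_dlopen() itself.
 */
__attribute__((weak)) void *sketch_dlopen(const char *path, int mode)
{
    return sketch_rtld_dlopen(path, mode);
}
```

With nothing overriding either symbol, sketch_dlopen() falls through to the
fake and returns NULL.  Note that this shows static link-time override only;
how weak definitions are treated across shared objects depends on the
dynamic linker, which is part of why the middle symbols are awkward.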

-- Terry