Question about rtld-elf. Anyone?.. Anyone?

Terry Lambert tlambert2 at mindspring.com
Tue Apr 29 22:23:49 PDT 2003


Narvi wrote:
> On Tue, 29 Apr 2003, Terry Lambert wrote:
> > I think this is a non-sequitur; what's "sensible" in that case?
> > Should the address space of the fork()'ed process contain the
> > dlopen()'ed object, or not?
> 
> It should either contain the object and the symbols or neither, not an
> in-between situation. And it shouldn't deadlock just because you called
> a function that needs to re-enter the rtld, or crash because you called
> dlsym and got handed a pointer pointing to nowhere.

That's exactly how it works, if you serialize access to
thread-unsafe functions, as documented by POSIX, so that the
"behaviour is undefined" case never occurs.
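
A minimal sketch of that kind of manual serialization, assuming
the application routes every dlopen() and fork() through its own
wrappers.  The names dl_lock, locked_dlopen() and locked_fork()
are made up for illustration; nothing here is part of rtld or
libc:

```c
#include <sys/types.h>

#include <assert.h>
#include <dlfcn.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical application-wide lock covering dl*() calls and fork(). */
static pthread_mutex_t dl_lock = PTHREAD_MUTEX_INITIALIZER;

static void
dl_lock_acquire(void)
{
	int rc;

	rc = pthread_mutex_lock(&dl_lock);
	assert(rc == 0);
	(void)rc;
}

static void
dl_lock_release(void)
{
	int rc;

	rc = pthread_mutex_unlock(&dl_lock);
	assert(rc == 0);
	(void)rc;
}

/* dlopen() with the lock held, so locked_fork() cannot run mid-load. */
void *
locked_dlopen(const char *path, int mode)
{
	void *handle;

	dl_lock_acquire();
	handle = dlopen(path, mode);
	dl_lock_release();
	return (handle);
}

/* fork() with the same lock held, so no locked_dlopen() is in progress. */
pid_t
locked_fork(void)
{
	pid_t pid;

	dl_lock_acquire();
	pid = fork();
	/* Parent and child each release their own copy of the lock. */
	dl_lock_release();
	return (pid);
}
```

This only helps if every caller in the process goes through the
wrappers; a library that calls dlopen() or fork() directly is
not covered.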

> > It seems to me that this situation is a coding error on the part
> > of the person who did not manually serialize access through a
> > pthread mutex, so that the address space was controlled over the
> > fork(), and the resulting process ended up with the state of its
> > address space known to the programmer.
> 
> And exactly why would they have to assume that there is a problem?

Because the standards documents say that the behaviour is
undefined?

> Especially as a library function can call a dlopen() without their
> knowledge (or really for that matter, call fork without their knowledge),
> so there isn't anything they can do either. If it is allowable to use
> functions from libraries you don't have source to in two different threads
> (and I can't imagine there being a serious restriction not to) there is
> nothing that can be done by the programmer, except possibly avoiding
> freebsd.

If it's the threaded libc making the call, it can know whether
it needs to serialize, because it is the code calling the
dangerous interface.
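
One way a library could do that serialization itself, sketched
under the assumption that pthread_atfork() is available.  The
names rtld_guard, guard_install() and guarded_dlopen() are
hypothetical and this is not what FreeBSD's libc or rtld
actually does:

```c
#include <assert.h>
#include <dlfcn.h>
#include <pthread.h>

/* Hypothetical library-private lock around its own dlopen() calls. */
static pthread_mutex_t rtld_guard = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread just before fork(): wait for loads to end. */
static void
guard_prepare(void)
{
	int rc;

	rc = pthread_mutex_lock(&rtld_guard);
	assert(rc == 0);
	(void)rc;
}

/* Runs in both parent and child right after fork(): drop the lock. */
static void
guard_release(void)
{
	int rc;

	rc = pthread_mutex_unlock(&rtld_guard);
	assert(rc == 0);
	(void)rc;
}

/* Call once, e.g. from library initialization.  Returns 0 on success. */
int
guard_install(void)
{
	return (pthread_atfork(guard_prepare, guard_release, guard_release));
}

/* The library's own dlopen() path, serialized against fork(). */
void *
guarded_dlopen(const char *path, int mode)
{
	void *handle;

	pthread_mutex_lock(&rtld_guard);
	handle = dlopen(path, mode);
	pthread_mutex_unlock(&rtld_guard);
	return (handle);
}
```

With the handlers installed, a fork() issued from any thread
waits for an in-progress guarded_dlopen() to finish, without the
caller of fork() having to know the lock exists.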


> > I think it's OK to fail pretty spectacularly if a programmer is
> > silly enough to cause this situation to occur in the first place;
> > the idea that the protection against this type of coding error
> > should be intrinsic to the library ignores the order-of-operation
> > issue, which can only be resolved by the computer reading the
> > programmer's mind to determine the intent of the programmer.
> 
> No mind reading is necessary - you have the choice of either delaying
> fork() until the dlopen() completes or reaches some safe stage from which
> cleanup is possible or you can schedule a fork() time cleanup that can
> undo any state. Nothing in dlopen() specs claims it is not fork() or
> threading safe. Non-thread safety would still give you at most one
> concurrent copy and not protection from fork().

You need to read the Corrigenda and the functional overview in
Chapter 12 of the Single UNIX Specification (POSIX, X/Open).  It
says in dlerror(), and I quote:

	Note that this interface is not thread safe, since the
	string may reside in a static area which is overwritten
	whenever an error occurs.  Application code should not
	write to this buffer.  Programs wishing to preserve an
	error message should make their own copies of that
	message.  Depending on the application environment with
	respect to asynchronous execution events, such as signals
	or other asynchronous computation sharing the address
	space, portable applications should use a critical section
	to retrieve the error pointer and buffer.

In fact, since this error is potentially set on each invocation
of a dl*() function, it's pretty clear that this critical
sectioning applies to callers of those functions as well.
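
A sketch of what such a critical section might look like around
dlsym(), copying the message out before the lock is dropped.
The lock and the locked_dlsym() name are assumptions for
illustration, and the lock would have to be the same one used
around every other dl*() call in the program:

```c
#include <assert.h>
#include <dlfcn.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical lock shared by all dl*() callers in the program. */
static pthread_mutex_t dlerr_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Look up a symbol and fetch any error text as one critical section.
 * Returns 0 and stores the address in *addrp on success; returns -1
 * and copies the dlerror() message into errbuf on failure.
 */
int
locked_dlsym(void *handle, const char *name, void **addrp,
    char *errbuf, size_t errlen)
{
	const char *msg;
	int rc;

	assert(addrp != NULL && errbuf != NULL && errlen > 0);

	pthread_mutex_lock(&dlerr_lock);
	(void)dlerror();		/* discard any stale error */
	*addrp = dlsym(handle, name);
	msg = dlerror();		/* points into rtld's own buffer */
	if (msg != NULL) {
		/* Make our own copy while nobody can overwrite it. */
		(void)snprintf(errbuf, errlen, "%s", msg);
		rc = -1;
	} else {
		errbuf[0] = '\0';
		rc = 0;
	}
	pthread_mutex_unlock(&dlerr_lock);
	return (rc);
}
```

The caller only ever sees its private copy in errbuf, never the
static area that the next dl*() failure may overwrite.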

It's pretty clear that the Aspen Group screwed the pooch on the
dlopen() family of functions; specifically, they did not expect
them to be used except in setup or teardown, and not
subsequently under active operation.

FWIW, there's more in the "Go Solo 2" book, including that these
are expected to be system interfaces, which means that there is
also an expectation of serialization by the kernel, and that a
descriptor reference is set for mmap() (which has the same set
of problems).  That is an implementation detail not under
technical control of the standard, though (i.e. it doesn't work
that way in FreeBSD, per se, or any BSD-derived system).

-- Terry


More information about the freebsd-threads mailing list