Implementing TLS: step 1

Fri Jun 20 00:18:04 PDT 2003

On Thu, Jun 19, 2003 at 11:30:59PM -0700, Terry Lambert wrote:
> [ ... hacked up Marcel's text in the post a bit ... ]

Cool graphics. Thanks!

> > The compiler generates access sequences according to the runtime
> > specification which in general means that all offsets to the TLS
> > are based on some TLS base address. On ia64 the thread pointer
> > points to the TLS and serves as the TLS base address. On other
> > architectures there may be an indirection. This means that on ia64
> > the lack of TLS still requires us to allocate something for the
> > thread pointer to point to. On other architectures this may not be
> > the case.
> 
> Implementation-defined access mechanisms are outside the scope
> of this discussion, since they have not yet been selected.

In fact, they are already architected as part of the psABI (which is
an extension to the psABI in most cases). It's our job to implement
our threading models within the psABI.

> Note(1): I have no idea how this applies to things like function
> pointers with this attribute pointed to functions without it;

There's no problem. It's no different than having a function pointer
on the stack or anywhere else in memory.

> I assume it will "do the right thing", and make separate data
> elements for the pointers, as directed, *AND* generate code to
> make the calls relative to the TLS for the active thread, which
> could make the implementation very complicated.

Function pointers are always absolute. The call sequence may need
to construct a displacement, but that's dependent on the IP, not
on where the function pointer was obtained from.
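
To make that concrete, here is a minimal sketch, assuming a compiler
that supports the `__thread` extension (GCC does). The names `handler`,
`set_handler`, `call_handler`, `twice` and `negate` are made up for
illustration. Only the pointer object lives in TLS; the value stored
in it is an ordinary absolute function address.

```c
#include <assert.h>
#include <stddef.h>

int twice(int x)  { return 2 * x; }
int negate(int x) { return -x; }

/* One pointer object per thread, zero-initialized (NULL) in each. */
static __thread int (*handler)(int);

/* Store an absolute function address in this thread's pointer. */
void set_handler(int (*fn)(int))
{
    assert(fn != NULL);
    handler = fn;
}

/*
 * Load the pointer via the TLS access sequence, then make a plain
 * indirect call. The call itself knows nothing about TLS.
 */
int call_handler(int x)
{
    return handler != NULL ? handler(x) : x;
}
```

Calling set_handler(twice) in one thread leaves the pointer in every
other thread untouched, but twice() itself is the same code at the
same address for all of them.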

> Note(2): For external global references, one would assume that
> there are scoping issues, i.e. that the external declaration with
> the "__thread" qualifier language extension *MUST* be in scope at
> the time, or, at best, the symbol decorations will not match, or,
> at worst, everyone who references an out of scope variable like
> this, or, if forced to have a reference in scope, the reference
> fails to also have the "__thread" qualifier, they would get the
> first thread's instance... or even worse, the template instance.

All thread-local variables must have TLS-specific relocations
attached to them. It's the linker's job, and also the rtld's job, to
validate this. A program is invalid if there's an inconsistency.
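
A consistent program looks roughly like the sketch below, which
assumes GCC's `__thread` extension and POSIX threads. The names
`counter`, `worker` and `run_worker` are made up for illustration.
Both the declaration and the definition carry `__thread`, so every
access is compiled with TLS relocations and each thread gets its own
instance.

```c
#include <assert.h>
#include <pthread.h>

/* Declaration, as it would appear in a shared header. */
extern __thread int counter;

/* Definition. The qualifier must match the declaration. */
__thread int counter = 0;

static void *worker(void *arg)
{
    counter += 5;              /* touches only this thread's instance */
    *(int *)arg = counter;
    return NULL;
}

/* Returns the value the new thread saw in its own instance. */
int run_worker(void)
{
    pthread_t t;
    int seen = -1;
    int rc;

    counter = 100;             /* the calling thread's instance */
    rc = pthread_create(&t, NULL, worker, &seen);
    assert(rc == 0);
    pthread_join(t, NULL);
    return seen;
}
```

The new thread starts from the initial value 0, not from the 100 the
calling thread stored, and the calling thread's instance is unchanged
afterwards.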

> > > I need to go out to the car and get my copy of the TLS proposal....
> > > this supports exec-time linking but does it support run-time (i.e after
> > > exec has begun) linking?
> > 
> > Yes. The rtld will dynamically construct the TLS template from the
> > images in the ELF files in the startup set and pass this in
> > AT_TLS_* by overriding the values (at least that was the idea).
> 
> This is where I personally have a problem with lazy initialization
> of per thread TLS.  Specifically, when a thread exits, you have to
> know what you have and have not instanced, on a per dynamic object,
> per thread basis, as a minimum granularity, in order to be able to
> clean it up, without trying to clean up things you have not yet
> instanced in that particular thread.  This strikes me as being
> unable to use the %gs "single instruction" shortcuts, which means
> that code generation for a dynamically linked object module would
> need to know, _a priori_, what kind of references it needed to be
> generating, OR *all* references would have to be via function and
> pointer indirection... meaning that the "single instruction"
> optimization is an illusion that can never happen in reality.

This is where the DTV comes in. It's basically a vector of TLS block
pointers and each pointer/index corresponds to the TLS block of
a shared library. At cleanup you iterate over the vector and clean
all the non-NULL pointers. This is specific to the dynamic TLS
model, BTW.
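
A rough sketch of that cleanup, under loud assumptions: `struct dtv`,
`dtv_get` and `dtv_cleanup` are hypothetical names, not an existing
rtld interface, and blocks are simply zero-filled with calloc() where
a real rtld would copy the module's TLS initialization image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical per-thread DTV: one slot per loaded module. A NULL
 * slot means this thread never instantiated that module's TLS block.
 */
struct dtv {
    size_t  nslots;
    void  **block;
};

/* Lazily instantiate the TLS block for module idx in this thread. */
void *dtv_get(struct dtv *dtv, size_t idx, size_t size)
{
    assert(idx < dtv->nslots);
    if (dtv->block[idx] == NULL)
        dtv->block[idx] = calloc(1, size);
    return dtv->block[idx];
}

/*
 * At thread exit: walk the vector and free only the non-NULL slots.
 * Returns the number of blocks that were released.
 */
size_t dtv_cleanup(struct dtv *dtv)
{
    size_t freed = 0;

    for (size_t i = 0; i < dtv->nslots; i++) {
        if (dtv->block[i] != NULL) {
            free(dtv->block[i]);
            dtv->block[i] = NULL;
            freed++;
        }
    }
    return freed;
}
```

The vector itself records what has been instanced, so no separate
per-object, per-thread bookkeeping is needed at exit.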

-- 
 Marcel Moolenaar	  USPA: A-39004		 marcel at xcllnt.net