NVIDIA and TLS
Julian Elischer
julian at elischer.org
Mon Jun 16 17:30:15 PDT 2003
On Mon, 16 Jun 2003, Gareth Hughes wrote:
> On Mon, 16 Jun 2003, Julian Elischer wrote:
> >
> > I think that the problem is that the access method for TLS is dependent
> > on which library is used.
> >
> > [snip]
> >
> > The trouble is that each of these would require a different mechanism to
> > reach TLS and the compiler cannot know ahead of time which one to use.
> >
> > [snip]
> >
> > I may be wrong but I don't think it is a standard yet..
> > especially for the reason that we see here..
> > It requires that the compiler know what threading library is in use.
> >
> > We could certainly implement efficient TLS code generation for each
> > library, but which one would be compiled in when you compile a .o file
> > that may be used with any library?
>
> Please read Ulrich Drepper's document, if you haven't done so already.
> You'll see that the general case of __thread variable access involves
> a function call to look up the variable's address. There are
> optimizations to this access model that allow one of the other three
> models to be used (ranging from a function call the first time a
> __thread variable is accessed, down to a single instruction per
> access). FreeBSD could trivially implement __thread variables with
> the General Dynamic model (involving a function call per access).
> Our driver uses the Local Exec model (single instruction per access)
> because GNU libc has an optimization on x86 that allows shared
> libraries to use this model, which is normally reserved for
> applications. The key thing is that they're still __thread variables,
> the access model depends on the compile time options used and what's
> available at runtime. Please, I urge you to read Drepper's document
> carefully.
>
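To make the quoted point concrete, here is a minimal sketch, assuming GCC
or Clang on an ELF platform that already supports __thread. The variable
and function names are made up for illustration. Both counters are
ordinary __thread variables; only the access model requested from the
compiler differs, and the linker is free to relax the general model to a
cheaper one when it can.

```c
/* Default model: the compiler picks one based on its options
 * (for example -fPIC or -ftls-model=...). */
static __thread int default_counter;

/* Explicitly request the General Dynamic model. In position-independent
 * code this normally means a call to a runtime lookup routine on each
 * access. */
static __thread int gd_counter __attribute__((tls_model("global-dynamic")));

/* Hypothetical helpers: the source code is identical for both models. */
int bump_default(void)
{
    return ++default_counter;
}

int bump_general_dynamic(void)
{
    return ++gd_counter;
}
```

Comparing the output of "cc -S -fPIC" for the two helpers is one way to
see how the generated access sequences differ.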
I have read most of it already.
What I'm saying is that we can, and probably should, implement TLS using
the general model to provide TLS at "sane" speed (e.g. 5 instructions),
but that I don't think we can implement the "1 instruction" version that
you are asking for without breaking the binary compatibility that we
currently have between our 3 pthread libraries. We can currently switch
between 3 very different threads libraries without recompiling
the app or any other libraries involved. In fact, we have a config file
for the loader that specifies which one to use at run time, **Per
application**. Without using an entry point (or maybe self-modifying
code, *EEK!*) I don't see how we can do it and keep that *Very useful*
functionality.
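
A rough sketch of the entry point idea, not how any of our libraries
actually does it: compiled code reaches its per-thread data only through
a pointer that is bound at load time, so the same .o works whichever
library is chosen. Everything here (struct tls_block, lib_a/lib_b/lib_c,
select_thread_library) is a hypothetical stand-in, and each "library"
has just one static block so the example stays single-threaded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-thread data block. */
struct tls_block { int value; };

/* Three stand-in libraries, each locating the block its own way. */
static struct tls_block block_a, block_b, block_c;
static struct tls_block *lookup_a(void) { return &block_a; }
static struct tls_block *lookup_b(void) { return &block_b; }
static struct tls_block *lookup_c(void) { return &block_c; }

/* The entry point: the only thing compiled objects depend on. */
typedef struct tls_block *(*tls_lookup_fn)(void);
static tls_lookup_fn tls_entry = lookup_a;

/* Stand-in for the loader applying its per-application choice.
 * Returns 0 on success, -1 for an unknown library name. */
int select_thread_library(const char *name)
{
    if (strcmp(name, "lib_a") == 0)      tls_entry = lookup_a;
    else if (strcmp(name, "lib_b") == 0) tls_entry = lookup_b;
    else if (strcmp(name, "lib_c") == 0) tls_entry = lookup_c;
    else return -1;
    return 0;
}

/* What library-agnostic object code would do: one indirect call,
 * then an ordinary load or store. */
void set_value(int v)
{
    assert(tls_entry != NULL);
    tls_entry()->value = v;
}

int get_value(void)
{
    assert(tls_entry != NULL);
    return tls_entry()->value;
}
```

The cost is the indirect call on every access, which is exactly what the
"1 instruction" model avoids by baking one library's layout into the
object file.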