Thread Local Storage

Doug Rabson dfr at nlsystems.com
Mon Mar 29 13:21:02 PST 2004


On Monday 29 March 2004 20:25, Julian Elischer wrote:
> Doug Rabson wrote:
> > I've been spending a bit of time recently familiarising myself with
> > this TLS stuff and trying out a few things. I've been playing with
> > rtld and I have a prototype patch which implements enough TLS
> > support to let a non-threaded program which uses static TLS work.
> > With a tiny bit more work I can have limited support for dynamic
> > TLS as well (not for dlopen'ed modules yet though). Is there a p4
> > tree for this stuff yet? I'd like to check in what I have sometime.
>
> there is a KSE p4 tree that is currently unused as we have everything
> in CVS at the moment..
>
> > I've also been looking at libpthread and I can see some potential
> > problems with it. Currently libpthread on i386 uses %gs to point at
> > a struct kcb which seems to be a per-kse structure. This structure
> > contains a pointer to a per-thread struct tcb and this pointer is
> > managed by the userland context switch code. Other arches are
> > similar, e.g. ia64 uses $tp to point at struct kcb.
>
> We're ahead of you there :-)
> In fact the spec requires that %gs:0 is the address of a POINTER to
> the per thread stuff.. The kse mailbox that %gs points to has
> reserved a field at this location to be that pointer.
>
> > The problem with TLS is that the i386 ABI needs %gs to point at the
> > TLS storage for the current thread (it's a tiny bit more involved
> > than that but that doesn't matter much for the purposes of this
> > discussion). This leads to trouble since it looks like we will end
> > up needing to allocate an LDT segment per thread, leading to an
> > arbitrary limit on the number of threads (~8192).
>
> No, you missed a level of indirection :-) (I did too originally).
> The x86 version of the spec (SUN variant) expects there to be a
> double indirection. This allows the UTS to keep the pointer up to date
> as to which thread is running on that KSE.
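The double indirection described in the quoted text can be sketched in C as follows. This is only an illustration: struct kcb and struct tcb are the names used in this thread, but the field names, the gs_base variable standing in for %gs, and the uts_switch() helper are hypothetical and not taken from libpthread.

```c
#include <stddef.h>

/* Per-thread control block; the field is a hypothetical placeholder. */
struct tcb {
    int tcb_id;
};

/* Per-KSE control block. The pointer to the running thread's tcb is
 * assumed to sit at offset 0, i.e. at what would be %gs:0. */
struct kcb {
    struct tcb *kcb_curtcb;
};

/* Stands in for the %gs segment base, which is fixed per KSE. */
static struct kcb *gs_base;

/* What the userland thread scheduler (UTS) would do on a context
 * switch: update one pointer, leave the segment register alone. */
void uts_switch(struct kcb *k, struct tcb *next)
{
    k->kcb_curtcb = next;
}

/* Two loads: first the pointer at "%gs:0", then the per-thread data. */
struct tcb *current_tcb(void)
{
    return gs_base->kcb_curtcb;
}
```

The point of the sketch is that switching threads on a KSE costs one store and no syscall, at the price of an extra load on every access.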

I saw that in the spec and that all seems fine. The problem I have is 
that using the GNU TLS model, %gs must also point at the next byte 
after the TLS storage with negative offsets from %gs accessing the 
storage (glibc gives the %gs segment a 4G size to allow this). In 
theory a static TLS access can reduce to just e.g.:

	movl %gs:x@ntpoff, %eax

to read the value of 'int __thread x'.
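For reference, a minimal C sketch of the kind of source that produces such an access. The accessor names read_x() and write_x() and the initial value are made up for illustration; which TLS access model the compiler picks depends on the compiler and flags, so the single-instruction form above is not guaranteed.

```c
/* One copy of x per thread. In a non-threaded program there is just
 * the initial thread's copy, set up from the TLS initialisation image. */
__thread int x = 42;

/* With static TLS on i386 this read can reduce to a single
 * %gs-relative load at a negative offset. */
int read_x(void)
{
    return x;
}

/* Writes go to the calling thread's copy only. */
void write_x(int v)
{
    x = v;
}
```

This is the non-threaded static TLS case that the prototype rtld patch is meant to cover.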

>
> > I can think of a couple of possible ways to get around this. One
> > easy way would be to allocate a segment per KSE and call
> > i386_set_ldt from the thread switch. Pretty ugly really and takes a
> > syscall. Another slightly better way would be to lazy-allocate
> > segments when we switch threads and reclaim segments from threads
> > which haven't run recently. This technique would be able to get
> > away with a smaller number of segments which tend to be owned by
> > the threads which run most often.
> >
> > There is a similar issue with libthr but since it already allocates
> > an LDT entry per thread there are no new limitations. Linux has an
> > interesting wrinkle on the libthr solution - they have a GDT per
> > cpu and they pre-allocate three GDT slots for TLS pointers (one for
> > glibc, one for Wine and one spare). The kernel thread switching
> > code fills in these GDT slots on the current cpu with values stored
> > in the pcb-equivalent.
>
> Yes in fact we are looking at switching to something similar..
> a GDT entry per CPU that the UTS plugs with "what I am running now"
> info for that CPU.

How do you do that without an expensive syscall?
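
For comparison, the lazy-allocation idea from my earlier message could look roughly like the sketch below: a small pool of segment slots, handed out on thread switch and reclaimed from whichever thread used its slot least recently. Everything here (NSEG, struct seg_slot, seg_pool_init(), seg_acquire()) is hypothetical; the place where i386_set_ldt would actually be called is only marked with a comment.

```c
#include <stddef.h>

#define NSEG     4      /* hypothetical pool size, far below the LDT limit */
#define NO_OWNER (-1)   /* thread ids are assumed to be non-negative */

struct seg_slot {
    int           owner;      /* thread id that currently owns the slot */
    unsigned long last_used;  /* logical time of the last switch-in */
};

static struct seg_slot pool[NSEG];
static unsigned long now;

void seg_pool_init(void)
{
    for (size_t i = 0; i < NSEG; i++) {
        pool[i].owner = NO_OWNER;
        pool[i].last_used = 0;
    }
    now = 0;
}

/* Called when thread tid is switched in. Returns the slot index and
 * sets *reloaded to 1 if the slot had to be reprogrammed (the
 * expensive path), or to 0 if the thread still owned a slot. */
int seg_acquire(int tid, int *reloaded)
{
    size_t victim = 0;

    now++;
    for (size_t i = 0; i < NSEG; i++) {
        if (pool[i].owner == tid) {
            pool[i].last_used = now;   /* fast path: no syscall needed */
            *reloaded = 0;
            return (int)i;
        }
        /* Track the least recently used slot; free slots have
         * last_used == 0 and are therefore picked first. */
        if (pool[i].last_used < pool[victim].last_used)
            victim = i;
    }

    /* Slow path: this is where i386_set_ldt would be called to point
     * the victim's segment at tid's TLS block. */
    pool[victim].owner = tid;
    pool[victim].last_used = now;
    *reloaded = 1;
    return (int)victim;
}
```

Threads that run often keep hitting the fast path, so the syscall is only paid when a thread that has not run recently is switched back in.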

