alignment of thread-local storage

dt71 at gmx.com dt71 at gmx.com
Wed Nov 6 01:43:10 UTC 2013


Starting with revision 191847 of Clang/LLVM, realloc() in libc can crash with a bus error under the conditions described below.


To reproduce:

(1) Compile the following program and link it with the cURL library (or see (3b)):
	#include <sys/types.h>
	#include <pwd.h>
	int main(void) { getpwuid(0); }
(2) Compile libc (I use -CURRENT), but when compiling jemalloc.c, specifically use Clang, revision >=191847, and use -march=prescott (or similar) and at least -O1.
(3) Run the program with the libc just built. The program should stop with a bus error.
(3b) If you do not link the program with any extra library, run it under gdb(1) instead. It should hit the same bus error.

In other words:
# echo 'CPUTYPE=prescott' >> /etc/make.conf
# echo 'CFLAGS=-g -O1' >> /etc/make.conf
# cd /usr/src/lib/libc && make
# cd
# cat > x.c <<-EOF
	#include <sys/types.h>
	#include <pwd.h>
	int main(void) { getpwuid(0); }
	EOF
# clang x.c -L /usr/local/lib -lcurl
# env LD_LIBRARY_PATH=/usr/src/lib/libc ./a.out

Alternatively, the last two command lines can be replaced with:
# clang x.c
# env LD_LIBRARY_PATH=/usr/src/lib/libc gdb ./a.out
(gdb) run


The backtrace is:

0x281d4235 in __realloc (ptr=0x282db7d0, size=<optimized out>)
     at jemalloc_jemalloc.c:1249
1249			ta->allocated += usize;
(gdb) bt
#0  0x281d4235 in __realloc (ptr=0x282db7d0, size=<optimized out>)
     at jemalloc_jemalloc.c:1249
#1  0x2826c119 in yygrowstack (data=0x282f5144) at nsparser.c:411
#2  0x2826b7e6 in _nsyyparse () at nsparser.c:470
#3  0x28276d34 in nss_configure () at /usr/src/lib/libc/net/nsdispatch.c:372
#4  0x28276301 in _nsdispatch (retval=0xbfbfdbe4, disp_tab=0x282dafb4,
     database=0x282d4982 "passwd", method_name=0x282d49b1 "getpwuid_r",
     defaults=0x282da594) at /usr/src/lib/libc/net/nsdispatch.c:645
#5  0x28254e9d in getpwuid_r (uid=0, pwd=0x282f4f50, buffer=0x28c0c400 "",
     bufsize=1024, result=0xbfbfdbe4) at /usr/src/lib/libc/gen/getpwent.c:609
#6  0x28255208 in wrap_getpwuid_r (key=..., pwd=0x282f4f50,
     buffer=0x28c0c400 "", bufsize=1024, res=0xbfbfdbe4)
     at /usr/src/lib/libc/gen/getpwent.c:686
#7  0x28254fda in getpw (fn=0x282551b0 <wrap_getpwuid_r>, key=...)
     at /usr/src/lib/libc/gen/getpwent.c:654
#8  0x282551a3 in getpwuid (uid=0) at /usr/src/lib/libc/gen/getpwent.c:714
#9  0x0804860a in main ()


The current (tentative) understanding of the problem is:

- As of Clang/LLVM r191847, the __jemalloc_thread_allocated_tls variable is updated using an SSE instruction (paddq) whose memory operand must be 16-byte aligned. The variable is defined as:
__thread thread_allocated_t __attribute__((tls_model("initial-exec"))) thread_allocated_tls = {0,0};

- At run time, the variable turns out to be only 4-byte aligned, which is not enough for that instruction.

<zygoloid> the thread_allocated_tls object *should* be 16-byte aligned
<zygoloid> is it?
<zygoloid> (if not, that's the bug; the generated code looks correct and good)
<zygoloid> in the IR we have @thread_allocated_tls = global ..., align 16

<o11c> how is storage for TLS variables allocated in the first place?
<o11c> compilers like to pretend that they exist magically, but they do not

<zygoloid> yeah, Clang emits the TLS variable as a 16-byte aligned symbol

<o11c> zygoloid: how are TLS variable actually allocated though?
<o11c> it can't be done once at load time like for globals
<o11c> so I'm guessing it must be malloc()ed or something
<o11c> if so, suspect your malloc


So, what is the bottom line here? That is, which one is wrong: FreeBSD, Clang, or both?

