alignment of thread-local storage
dt71 at gmx.com
Wed Nov 6 01:43:10 UTC 2013
Starting with revision 191847 of Clang/LLVM, realloc() in libc can die with a bus error under the circumstances described below.
To reproduce:
(1) Compile the following program and link it with the cURL library (or see (3b)):
#include <sys/types.h>
#include <pwd.h>
int main(void) { getpwuid(0); }
(2) Build libc (I use -CURRENT), making sure that jemalloc.c is compiled with Clang at revision 191847 or later, with -march=prescott (or similar) and at least -O1.
(3) Run the program with the libc just built. The program should stop with a bus error.
(3b) If you do not link with any extra library, run the program under gdb(1) instead. It should hit the same bus error.
In other words:
# echo 'CPUTYPE=prescott' >> /etc/make.conf
# echo 'CFLAGS=-g -O1' >> /etc/make.conf
# cd /usr/src/lib/libc && make
# cd
# cat > x.c <<-EOF
#include <sys/types.h>
#include <pwd.h>
int main(void) { getpwuid(0); }
EOF
# clang x.c -L /usr/local/lib -lcurl
# env LD_LIBRARY_PATH=/usr/src/lib/libc ./a.out
Alternatively, the last two command lines can be replaced with:
# clang x.c
# env LD_LIBRARY_PATH=/usr/src/lib/libc gdb ./a.out
(gdb) run
The backtrace is:
0x281d4235 in __realloc (ptr=0x282db7d0, size=<optimized out>)
    at jemalloc_jemalloc.c:1249
1249        ta->allocated += usize;
(gdb) bt
#0  0x281d4235 in __realloc (ptr=0x282db7d0, size=<optimized out>)
    at jemalloc_jemalloc.c:1249
#1  0x2826c119 in yygrowstack (data=0x282f5144) at nsparser.c:411
#2  0x2826b7e6 in _nsyyparse () at nsparser.c:470
#3  0x28276d34 in nss_configure () at /usr/src/lib/libc/net/nsdispatch.c:372
#4  0x28276301 in _nsdispatch (retval=0xbfbfdbe4, disp_tab=0x282dafb4,
    database=0x282d4982 "passwd", method_name=0x282d49b1 "getpwuid_r",
    defaults=0x282da594) at /usr/src/lib/libc/net/nsdispatch.c:645
#5  0x28254e9d in getpwuid_r (uid=0, pwd=0x282f4f50, buffer=0x28c0c400 "",
    bufsize=1024, result=0xbfbfdbe4) at /usr/src/lib/libc/gen/getpwent.c:609
#6  0x28255208 in wrap_getpwuid_r (key=..., pwd=0x282f4f50,
    buffer=0x28c0c400 "", bufsize=1024, res=0xbfbfdbe4)
    at /usr/src/lib/libc/gen/getpwent.c:686
#7  0x28254fda in getpw (fn=0x282551b0 <wrap_getpwuid_r>, key=...)
    at /usr/src/lib/libc/gen/getpwent.c:654
#8  0x282551a3 in getpwuid (uid=0) at /usr/src/lib/libc/gen/getpwent.c:714
#9  0x0804860a in main ()
The current (tentative) understanding of the problem is:
- As of Clang/LLVM r191847, the __jemalloc_thread_allocated_tls variable is updated using a processor instruction (paddq) that requires its memory operand to be 16-byte aligned. The variable is defined as:
__thread thread_allocated_t __attribute__((tls_model("initial-exec"))) thread_allocated_tls = {0,0};
- The variable turns out to be insufficiently aligned, having only 4-byte alignment.
Excerpt from an IRC discussion of the problem:
<zygoloid> the thread_allocated_tls object *should* be 16-byte aligned
<zygoloid> is it?
<zygoloid> (if not, that's the bug; the generated code looks correct and good)
<zygoloid> in the IR we have @thread_allocated_tls = global ..., align 16
<o11c> how is storage for TLS variables allocated in the first place?
<o11c> compilers like to pretend that they exist magically, but they do not
<zygoloid> yeah, Clang emits the TLS variable as a 16-byte aligned symbol
<o11c> zygoloid: how are TLS variable actually allocated though?
<o11c> it can't be done once at load time like for globals
<o11c> so I'm guessing it must be malloc()ed or something
<o11c> if so, suspect your malloc
So, what could be the bottom line of this? That is, which one is WRONG -- FreeBSD or Clang (or both)?