Debugging rtld

Jeremie Le Hen jeremie at le-hen.org
Sun Jun 8 21:22:42 UTC 2008


Hi Kostik, hackers,

On Sat, May 17, 2008 at 08:46:37PM +0300, Kostik Belousov wrote:
> On Sat, May 17, 2008 at 07:35:25PM +0200, Jeremie Le Hen wrote:
> > Hi,
> > 
> > On Sat, May 17, 2008 at 01:26:53PM +0300, Kostik Belousov wrote:
> > > On Sat, May 17, 2008 at 11:17:40AM +0200, Jeremie Le Hen wrote:
> > > > I tried to compile my source tree with -fstack-protector-all, and it
> > > > happens that rtld breaks with this: once the new rtld is installed every
> > > > single program coredumps.  I tried to compile rtld-elf without SSP, but
> > > > it didn't solve the problem.  Then I had to compile libc_pic.a without
> > > > SSP and it worked, but I don't understand the root of the problem.
> > > > So I want to use the generated coredump for post-mortem analysis with
> > > > gdb.
> > > > 
> > > > I compiled world with DEBUG_FLAGS=-g.  But GDB gives me a backtrace so
> > > > long that it can't be real.  Moreover it doesn't seem to bring in the
> > > > required symbols.  I'm a GDB novice, so I'd like some help.
> > > > 
> > > > chroot> ===> libexec/rtld-elf (install)
> > > > chroot> chflags noschg /usr/libexec/ld-elf.so.1
> > > > chroot> install -s -o root -g wheel -m 555  -C -b -fschg -S ld-elf.so.1 /libexec
> > > > chroot> install -o root -g wheel -m 444 rtld.1.gz  /usr/share/man/man1
> > > > chroot> *** Signal 11
> > > > chroot>
> > > > chroot> jarjarbinks# cd /root; ls
> > > > chroot> Segmentation fault
> > > > 
> > > > host> jarjarbinks:145# ls -l /space/chroot/root/ls.core 
> > > > host> -rw-------  1 root  wheel  184320 May 17 10:19 /space/chroot/root/ls.core
> > > > host> jarjarbinks:149# gdb -c /space/chroot/root/ls.core -e /space/chroot/bin/ls
> > > > host> GNU gdb 6.1.1 [FreeBSD]
> > > > host> [...]
> > > > host> This GDB was configured as "i386-marcel-freebsd".
> > > > host> Core was generated by `ls'.
> > > > host> Program terminated with signal 11, Segmentation fault.
> > > > host> #0  0x280583e4 in ?? ()
> > > > host> (gdb) bt
> > > > host> #0  0x280583e4 in ?? ()
> > > > host> #1  0x00000000 in ?? ()
> > > > host> #2  0x00000000 in ?? ()
> > > > host> #3  0x00000000 in ?? ()
> > > > host> #4  0x00000000 in ?? ()
> > > > host> #5  0x00000000 in ?? ()
> > > > host> #6  0x00000000 in ?? ()
> > > > host> #7  0x00000000 in ?? ()
> > > > host> #8  0x00000000 in ?? ()
> > > > host> #9  0x00000000 in ?? ()
> > > > host> #10 0x00000000 in ?? ()
> > > > host> #11 0xffffffff in ?? ()
> > > > host> #12 0x00001000 in ?? ()
> > > > host> [...]
> > > > host> #359 0x73763a68 in ?? ()
> > > > host> #360 0x5b455c3d in ?? ()
> > > > host> [...]
> > > > host> #855 0x00000000 in ?? ()
> > > > host> [...]
> > > > 
> > > > Any hint on how to proceed would be welcome.
> > > 
> > > I usually add the CFLAGS+=-g to the rtld-elf/Makefile. Also, you do not
> > > need to bring down the whole host by the broken ld.so.1. Do not install
> > > it at all, and specify the path to the rtld by the --dynamic-linker switch,
> > > see into ld.
> Hmm,    ^^^^ info
> > 
> > Thank you for this tip.  However the backtrace is still unusable.
> > I've recompiled libc_pic.a with -g, then rtld-elf with -g and finally
> > /bin/ls with your tip and -g.
> > 
> > I'm really brought to a standstill here.
> 
> Looks like you have a stack corruption, which is reasonable given the
> matters you are touching. The easiest, although somewhat time-consuming, way
> of finding the point where things break is to insert a breakpoint
> into the rtld code ("int3" may be good) and move it forward
> until you start hitting the breakage instead of the breakpoint.

I've naively added
    asm("int3;");
to the rtld-elf source, but I definitely can't debug rtld like an ordinary
program.  Judging from the rtld source, I'd say GDB needs some
cooperation from rtld itself.  Chicken and egg problem here.
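
For readers unfamiliar with the trick, here is a minimal sketch of such a
marker.  DEBUG_TRAP, count_traps() and suspect_code() are hypothetical names
for illustration and do not exist in rtld-elf.  The SIGTRAP handler is only
there so the effect can be observed outside a debugger; without a handler or
an attached debugger the trap normally terminates the process.  Whether a
handler is usable that early inside rtld is not something this sketch
addresses.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical marker macro; rtld-elf does not define DEBUG_TRAP. */
#if defined(__i386__) || defined(__x86_64__)
#define DEBUG_TRAP() __asm__ volatile("int3")   /* x86 breakpoint instruction */
#else
#define DEBUG_TRAP() raise(SIGTRAP)             /* stand-in on other CPUs */
#endif

static volatile sig_atomic_t traps_seen;

/* Count each trap instead of letting it kill the process. */
static void on_trap(int sig)
{
    (void)sig;
    traps_seen++;
}

/* Run fn() with the counting handler installed; return how many traps fired. */
static int count_traps(void (*fn)(void))
{
    void (*old)(int) = signal(SIGTRAP, on_trap);
    assert(old != SIG_ERR);
    traps_seen = 0;
    fn();
    signal(SIGTRAP, old);
    return (int)traps_seen;
}

/* Stand-in for the code being investigated. */
static void suspect_code(void)
{
    /* ... code believed to work ... */
    DEBUG_TRAP();
    /* ... code under suspicion ... */
}
```

If the process stops at the marker, everything before it ran; if it crashes
first, the fault lies earlier and the marker has to move back.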

I've tested various methods to get debugging working, such as explicitly
using the "add-symbol-file" GDB command to load ld-elf.so.1 at its entry
point.  Given that the backtrace is unusable, I've tried to dump the
stack starting from $ebp, and I think I could identify one or two
pointers into the text, but GDB says there is nothing to disassemble at
those addresses.
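
To make the manual stack dump concrete, here is a rough sketch of what
following the chain from $ebp amounts to.  It assumes the classic i386
layout left by "push %ebp; mov %esp,%ebp" (saved frame pointer, then return
address) and code built with frame pointers.  struct frame and walk_frames()
are hypothetical names, not anything taken from GDB or rtld.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What the frame pointer points at under the assumed i386 convention. */
struct frame {
    const struct frame *saved_fp;   /* caller's frame pointer */
    uintptr_t           ret_addr;   /* return address into the caller */
};

/*
 * Collect up to max return addresses by following saved frame pointers.
 * The stack grows down, so every caller frame must sit at a higher
 * address; a link that does not is treated as corruption and ends the walk.
 */
static size_t walk_frames(const struct frame *fp, uintptr_t *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL);
    while (fp != NULL && n < max) {
        out[n++] = fp->ret_addr;
        if (fp->saved_fp != NULL &&
            (uintptr_t)fp->saved_fp <= (uintptr_t)fp)
            break;                  /* chain points backwards: corrupt */
        fp = fp->saved_fp;
    }
    return n;
}
```

A debugger that follows the links without such a sanity check keeps printing
whatever words it finds, which would be consistent with the very long
backtrace full of zeroes quoted above.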

I don't have enough {linker,loader,gdb}-fu to go on this way.

FWIW, I managed to locate the segmentation fault by bisection, placing
"assert(NULL)" at successive points in the code.  Interestingly, when libc
is compiled *with* -fstack-protector-all (except sys/stack_protector.c) and
rtld is compiled *without* stack protection, the call to mmap(2) in
libexec/rtld-elf/i386/reloc.c:reloc_non_plt() triggers a SIGSEGV.
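
The search itself is an ordinary binary search over marker positions.  The
sketch below models it under the assumption that the code can be viewed as n
numbered steps and that the crash happens in exactly one of them.
first_bad_step() and simulated_probe() are hypothetical; in practice each
"probe" is a rebuild and a run with assert(NULL) moved to a new spot.

```c
#include <assert.h>
#include <stddef.h>

/*
 * probe(i) must return nonzero when a marker placed just before step i
 * fires, i.e. execution got that far without crashing.
 * Returns the index of the step that crashes, assuming one of the steps
 * 0..n-1 does.
 */
static size_t first_bad_step(size_t n, int (*probe)(size_t))
{
    size_t lo = 0;      /* a marker before step lo is always reached */
    size_t hi = n;      /* a marker before step hi is never reached */

    assert(n > 0);
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (probe(mid))
            lo = mid;   /* crash is at step mid or later */
        else
            hi = mid;   /* crash happened before step mid */
    }
    return lo;
}

/* Stand-in for "rebuild with the marker at this position and run". */
static size_t simulated_crash_step;

static int simulated_probe(size_t marker)
{
    return marker <= simulated_crash_step;
}
```

With n candidate positions this needs about log2(n) rebuilds rather than n.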

Any hint from an rtld guru would be welcome...

Thanks.
-- 
Jeremie Le Hen
< jeremie at le-hen dot org >< ttz at chchile dot org >

