Puzzling stack trace

Peter Steele psteele at maxiscale.com
Fri Mar 26 12:33:13 UTC 2010

I'm reposting this here since it's a pretty low-level discussion. Hopefully someone here can explain what's going on.

We had an app crash and the resulting core dump produced a very puzzling stack trace:

#0 0x00000008011d438c in thr_kill () from /lib/libc.so.7
#1 0x00000008012722bb in abort () from /lib/libc.so.7
#2 0x00000008011fb70c in malloc_usable_size () from /lib/libc.so.7
#3 0x00000008011fbb95 in malloc_usable_size () from /lib/libc.so.7
#4 0x00000008011fdaea in _malloc_thread_cleanup () from /lib/libc.so.7
#5 0x00000008011fdc86 in _malloc_thread_cleanup () from /lib/libc.so.7
#6 0x00000008011fc8e9 in malloc_usable_size () from /lib/libc.so.7
#7 0x00000008011fccc7 in malloc_usable_size () from /lib/libc.so.7
#8 0x00000008011ffe8f in malloc () from /lib/libc.so.7
#9 0x000000080127374b in memchr () from /lib/libc.so.7
#10 0x000000080125e6e9 in __srget () from /lib/libc.so.7
#11 0x00000008012352dd in vsscanf () from /lib/libc.so.7
#12 0x0000000801220087 in fscanf () from /lib/libc.so.7

This trace resulted from a call to fscanf, as follows:

char buffer[21];
fscanf(in, "%20s", buffer);
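For reference, the same call can be exercised in isolation with a minimal sketch like the one below. The helper name read_token() and the use of tmpfile() are made up for illustration and are not part of our application. The sketch only repeats the "%20s" conversion into a 21-byte buffer and checks the value fscanf() returns.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from our application): writes `input` to a
 * temporary file, then reads one token back with the same "%20s"
 * conversion used in the failing call. `out` must have room for at
 * least 21 bytes (20 characters plus the terminating NUL).
 *
 * Returns fscanf()'s result: 1 if a token was stored, EOF if the input
 * ended before any token was found. -2 is our own marker for a failure
 * while setting up the temporary file.
 */
int
read_token(const char *input, char *out)
{
    FILE *in;
    int n;

    assert(input != NULL && out != NULL);
    out[0] = '\0';

    in = tmpfile();
    if (in == NULL)
        return (-2);
    if (fputs(input, in) == EOF) {
        fclose(in);
        return (-2);
    }
    rewind(in);

    /* The field width of 20 caps what is written at 20 chars + NUL. */
    n = fscanf(in, "%20s", out);
    fclose(in);
    return (n);
}
```

Called with a token longer than 20 characters, this stores only the first 20 and terminates the buffer at index 20, so the conversion on its own should not be able to overrun a 21-byte array.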

We've verified that the data being read was correct, and the buffer that fscanf() stores the string into is clearly valid: it is a 21-byte array, large enough for the 20 characters plus the terminating NUL, and it is not NULL. So what would lead this fscanf() call to end up in abort()? Everything seems to be in order.

What's more puzzling to us is that we've looked for calls to malloc_usable_size() in the libc sources and, although the function is defined, we can find no call to it anywhere in our FreeBSD 8 sources:

$ grep -R 'malloc_usable_size' *|grep -v .svn
libc/stdlib/Symbol.map: malloc_usable_size;
libc/stdlib/Makefile.inc:       malloc.3 realloc.3 malloc.3 reallocf.3 malloc.3 malloc_usable_size.3
libc/stdlib/malloc.c:malloc_usable_size(const void *ptr)

That's it. Nothing calls this function from what we can tell. Even if something did call it, we don't understand why it would call abort(). It has an assert:

size_t
malloc_usable_size(const void *ptr)
{
        assert(ptr != NULL);
        return (isalloc(ptr));
}

but the pointer we pass to fscanf() is clearly not NULL, so what pointer would this function be testing?
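One check that can be made from the trace alone is how far apart the frames attributed to malloc_usable_size() are. The sketch below takes the four addresses from frames #2, #3, #6 and #7 and computes the distance between the lowest and the highest. The helper name address_span() is hypothetical. If the spread is much larger than a two-line function could occupy, one possible reading (an assumption on our part, not something we have confirmed) is that the debugger is attributing addresses inside unexported static functions in malloc.c to the nearest exported symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Addresses copied from frames #2, #3, #6 and #7 of the backtrace. */
static const uint64_t usable_size_frames[] = {
    0x00000008011fb70cULL,
    0x00000008011fbb95ULL,
    0x00000008011fc8e9ULL,
    0x00000008011fccc7ULL
};

/*
 * Hypothetical helper: returns the distance in bytes between the
 * lowest and highest address in addrs[0..n-1].
 */
uint64_t
address_span(const uint64_t *addrs, size_t n)
{
    uint64_t lo, hi;
    size_t i;

    assert(addrs != NULL && n > 0);
    lo = hi = addrs[0];
    for (i = 1; i < n; i++) {
        if (addrs[i] < lo)
            lo = addrs[i];
        if (addrs[i] > hi)
            hi = addrs[i];
    }
    return (hi - lo);
}
```

For the four frames above, address_span(usable_size_frames, 4) gives 0x15bb, or 5563 bytes, which is far more code than the assert and the single return statement in malloc_usable_size() would produce.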

It's all very puzzling and we cannot reproduce this failure. We'd like to understand what happened though.

More information about the freebsd-hackers mailing list