svn commit: r227753 - in head: contrib/gdtoa include lib/libc/gdtoa lib/libc/gen lib/libc/locale lib/libc/regex lib/libc/stdio lib/libc/stdlib lib/libc/stdtime lib/libc/string

David Schultz das at freebsd.org
Mon Jan 30 21:26:32 UTC 2012


On Mon, Jan 30, 2012, David Chisnall wrote:
> On 18 Jan 2012, at 19:07, David Schultz wrote:
> 
> > This patch appears to cause a large performance regression.  For
> > example, I measured a 78% slowdown for strtol("    42", ...).
> 
> That's definitely worth taking a closer look at.  I think we can
> cache some things in TLS and avoid some pthread_getspecific calls.
> The current code is the 'make it work' version.  The 'make it fast'
> version is planned...

Sounds good; I look forward to it.
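
A measurement of this kind can be sketched roughly as below.  This is
not the harness behind the 78% figure; the helper name ns_per_strtol
and the iteration count are made up for illustration, and clock()
reports CPU time, so the numbers are only meaningful when compared
before and after the change on the same machine.

```c
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper: average CPU nanoseconds per strtol() call
 * when parsing the string s in base 10.
 */
static double
ns_per_strtol(const char *s, long iters)
{
	volatile long sink = 0;	/* keeps the loop from being optimized out */
	clock_t start, end;
	long i;

	start = clock();
	for (i = 0; i < iters; i++)
		sink += strtol(s, NULL, 10);
	end = clock();

	return ((double)(end - start) / CLOCKS_PER_SEC * 1e9 / (double)iters);
}
```

Calling ns_per_strtol("    42", 10000000) against a libc built before
and after r227753 and taking the ratio would give a slowdown figure
comparable in spirit to the one quoted above.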

> > Furthermore, the resulting static binary for a trivial program
> > goes from 7k to 303k, due to pulling in malloc, stdio, and all the
> > pthread stubs.  
> 
> That's not ideal, but I'm not sure if it's avoidable.  Is statically
> linking libc something people regularly do?

Aside from bde, probably not many.  This is definitely a
second-order concern.

FreeBSD has a set of statically linked binaries in /rescue for
situations where /lib gets screwed up.  Space is an issue there
because the root partition is historically sized quite small.

Embedded folks might also care, but I'll let them speak for
themselves.  I did get a request several years ago from an
embedded developer to unbreak the NO_FLOATING_POINT option in
libc, and you could imagine perhaps a NO_LOCALE option as well.

> Yup.  A quick-and-dirty hack would be to add a flag that was set on
> the first call to uselocale() and to always use the global locale if
> this is not set.  That should remove a lot of the overhead in cases
> where no one uses the per-thread locales.
>
> We can also probably store the locale in TLS, which (on platforms
> with fast TLS) should speed up the lookup a bit.

I thought that's what thread_(get|set)_locale already did.
Actually, it's counterintuitive that it would be significantly
slower to access per-thread state than global state.  Any idea
why?  Maybe it says something about our pthread_getspecific()
implementation.  I will run the code through a profiler some
day, but I don't have the cycles right now.
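
For concreteness, the flag-plus-TLS idea quoted above might look
something like the following.  This is only a sketch under stated
assumptions: every sketch_* name is hypothetical and none of it is
the actual libc code, it assumes a C11 compiler with _Thread_local
and <stdatomic.h>, and it ignores LC_GLOBAL_LOCALE handling.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for libc's internal locale structure. */
struct sketch_locale {
	const char *name;
};

static struct sketch_locale sketch_global_locale = { "C" };

/* Set on the first call that installs a per-thread locale; never cleared. */
static atomic_int sketch_uselocale_called;

/* Per-thread locale kept in compiler-supported TLS, not pthread keys. */
static _Thread_local struct sketch_locale *sketch_thread_locale;

/* Fast path: if no thread ever set a locale, skip the TLS lookup. */
static struct sketch_locale *
sketch_get_locale(void)
{
	if (!atomic_load(&sketch_uselocale_called))
		return (&sketch_global_locale);
	if (sketch_thread_locale != NULL)
		return (sketch_thread_locale);
	return (&sketch_global_locale);
}

/* Installs loc for the calling thread and returns the previous locale. */
static struct sketch_locale *
sketch_uselocale(struct sketch_locale *loc)
{
	struct sketch_locale *old = sketch_get_locale();

	if (loc != NULL) {
		sketch_thread_locale = loc;
		atomic_store(&sketch_uselocale_called, 1);
	}
	return (old);
}
```

With something like this, a program that never installs a per-thread
locale pays for one flag load per lookup and nothing else.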

