rtld optimizations

Kostik Belousov kostikbel at gmail.com
Thu Jan 27 20:59:12 UTC 2011


On Thu, Jan 27, 2011 at 08:50:48PM +0000, Devin Teske wrote:
> On Thu, 2011-01-27 at 22:31 +0200, Kostik Belousov wrote:
> 
> > On Thu, Jan 27, 2011 at 12:37:54PM -0500, Mark Saad wrote:
> > > On Thu, Jan 27, 2011 at 6:05 AM, David Naylor <naylor.b.david at gmail.com> wrote:
> > > > On Wednesday 26 January 2011 06:49:11 Alexander Kabaev wrote:
> > > >> On Tue, 25 Jan 2011 21:40:42 -0500
> > > >>
> > > >> Mark Saad <nonesuch at longcount.org> wrote:
> > > >> > Hello Hackers
> > > >> >
> > > >> > The NetBSD folks have a nice improvement with the rtld-elf subsystem,
> > > >> > known as "Negative Symbol Cache" .
> > > >> >
> > > >> > http://blog.netbsd.org/tnf/entry/netbsd_runtime_linker_gains_negative
> > > >> >
> > > >> >  Roy Marples roy@ has a simple write up of the change.
> > > >> >
> > > >> > I took the basic idea from FreeBSD, but improved the performance
> > > >> > drastically. Basically, the huge win is by caching both breadth and
> > > >> > depth of the needed/weak symbol lookup.
> > > >> > Easiest to think of a,b,c,d as a matrix and FreeBSD just cache a row
> > > >> > where we cache both rows and columns.
> > > >> >
> > > >> > Has anyone looked into porting the changes back to FreeBSD ?  The
> > > >> > improvement on load time for things like firefox, openoffice, and java
> > > >> > is huge on NetBSD. It looks like this change could improve load times
> > > >> > on FreeBSD in the same ways.
> > > >>
> > > >> This is the second time someone has posted this to a public mailing
> > > >> list and, curiously enough, the second time it has been suggested that
> > > >> someone else do the investigation. From a quick look, the commit in
> > > >> question is more or less a direct rip-off of the Donelists we have had for ages and as
> > > >> such is completely over-hyped. The only extra quirk that said commit
> > > >> does is an optimization of a dlsym() call, which is hardly ever in
> > > >> critical performance path. Said optimization is trivial and easy to
> > > >> try. Here you have it:
> > > >> http://people.freebsd.org/~kan/rtld-symlook-depth.diff
> > > >>
> > > >> Since it only applies to dlsym, it only affects programs that are heavy
> > > >> plugin users, which I suppose is the category OpenOffice and firefox
> > > >> both fall into. Care to do some benchmarks with and without the
> > > >> patch and report the results? I frankly doubt that you'll see any
> > > >> noticeable difference compared to our stock rtld's performance.
> > > >
> > > > I benchmarked the impact said patch has on the boot-time of my system.  I
> > > > timed the boot-time to when KDE launches autostart programs and once all
> > > > programs have loaded (I run a few extra programs, such as amarok).  The latter
> > > > measure requires human action thus it has extra, human, variance in its
> > > > measure.
> > > >
> > > > I tried an older version of rtld (about 2 months old), current version of rtld
> > > > and the new (patched) rtld.  I ran each test three times.  There was little
> > > > variance in the tests and I am confident that the rtld version makes no
> > > > difference to my boot-time.
> > > >
> > > > Here is a summary of my boot times (in seconds).  First measure is when KDE
> > > > autostarts programs, the latter is when I determined when all programs had
> > > > launched.
> > > > rtld-old: 69 96
> > > > rtld:     69 94
> > > > rtld-new: 69 94
> > > >
> > > > Please note that kernel boot time is approximately 10 seconds and kdm is
> > > > delayed by about 10 seconds thus 20 seconds can be removed from above numbers
> > > > to determine non-kernel boot wall-time.
> > > >
> > > > I would like to add that the blog entry claims a substantial improvement for
> > > > some use cases.  Is it not worth optimizing these fringe cases, as one man's
> > > > fringe case is another man's normal case (or woman's, as one prefers)?
> > > >
> > > 
> > > 
> > > So I figured out how to properly fit my foot in my mouth and set out
> > > to retest this on NetBSD.
> > > Turns out that in most cases the speedup is not as dramatic.
> > > 
> > > Firefox 3.6.16 on amd64
> > > 
> > > old ld.elf_so:  4.07 seconds
> > > new ld.elf_so: 3.89 seconds
> > > 
> > > Openoffice 3.1 on amd64
> > > 
> > > old ld.elf_so: 2.67 seconds
> > > new ld.elf_so:  2.60 seconds
> > > 
> > > I am slightly perturbed that I can start OpenOffice faster than I can
> > > start Firefox, oh well.
> > 
> > Can you please satisfy my curiosity? How did you determine the moment
> > at which interactive applications like ff or oo finished starting up?
> 
> 
> Probably did something like this:
> 
>     time sh -c '( firefox & ); sleep 10000000'
> 
> and then pressed Ctrl-C when he felt that firefox was finished loading.
> The moment Ctrl-C is pressed, time(1) shows how long it ran up until you
> pressed Ctrl-C.
> NOTE: Pressing Ctrl-C will not terminate the firefox instance.

You cannot get 1/100-second precision with this method.
This is why I am asking, seeing a difference of less than 0.1 seconds.
Not to mention some methodological questions, such as whether the caches
were warmed by several runs before the actual test.
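
To make the two concerns above concrete, here is a minimal sketch of a
timing harness that does untimed warm-up runs first and then reports the
mean and spread of several timed runs. It is not what anyone in this thread
actually ran. The names time_command and summarize are hypothetical, and the
command being timed is only a placeholder that starts the Python interpreter
and exits. The sketch assumes the command terminates on its own; an
interactive program such as firefox would first need some machine-detectable
"startup finished" event, which is not covered here.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, warmups=3, runs=5):
    """Return wall-clock durations in seconds for `runs` executions of argv.

    `warmups` untimed executions are done first so that the binaries and
    shared libraries are likely to be in the file system cache already.
    """
    for _ in range(warmups):
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)

    samples = []
    for _ in range(runs):
        start = time.perf_counter()  # monotonic, high-resolution clock
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return (mean, sample standard deviation) of the durations."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return mean, stdev


if __name__ == "__main__":
    # Placeholder command (an assumption): start the interpreter and exit.
    samples = time_command([sys.executable, "-c", "pass"])
    mean, stdev = summarize(samples)
    print(f"mean {mean:.3f} s, stdev {stdev:.3f} s over {len(samples)} runs")
```

If the difference between two rtld builds is smaller than the spread
reported for either of them, the measurement does not support the claim that
one is faster.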


More information about the freebsd-hackers mailing list