Optimizing RCng execution speed?

Wed Apr 14 06:50:38 PDT 2004

Peter Jeremy <peterjeremy at optushome.com.au> wrote:
> On Wed, Apr 14, 2004 at 07:54:11AM +0100, Colin Percival wrote:
> >At 21:00 13/04/2004, Peter Jeremy wrote:
> >>If someone wants to have a look at this,
> >>the place to start is to profile the complete system during startup
> >>and see where the time is going.
> >
> >  On my 5.2.1 system (with the mentioned don't-reload-rc.subr patch):
> >Starting RC scripts:                    kern.cp_time: 1 0 44 2 53
> ...
> >RC scripts done:                        kern.cp_time: 71 0 455 5 819
> 
> Overall, that's 61% idle, 33% sys and 6% user.  I suspect the 'idle'
> time is virtually all waiting for disk I/O.  Someone else (green@ ?)
> has commented that increasing parallelism slowed things down - which
> is consistent with system startup being I/O bound.

Yep -- in this case it was bsdtar, and I was unable to gain any performance
from it by increasing parallelism.  This tells me that I/O is a contention
point, but I believe it also tells me that I/O read-ahead and caching work
very effectively.  So for anything else with I/O as a contention point, if
it is only operating on one disk there probably isn't anything to be gained
from userland-created parallelism.  That 6% user figure shows that sh(1)
isn't really the cause of "long" boot times.  It's amazing that anything
more than a minute is now too long for some people ;)
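
For anyone who wants to redo the arithmetic behind those percentages, here
is a minimal sketch.  It assumes the five kern.cp_time counters are in the
usual FreeBSD order (user, nice, sys, intr, idle) and uses only the two
samples quoted above; the function name is made up for illustration.

```python
# Field order of the kern.cp_time sysctl on FreeBSD (assumed here):
# user, nice, sys, intr, idle.
FIELDS = ("user", "nice", "sys", "intr", "idle")


def cp_time_shares(start, end):
    """Return each CPU state's share, in percent, of the ticks between two samples."""
    deltas = [e - s for s, e in zip(start, end)]
    total = sum(deltas)
    return {name: 100.0 * d / total for name, d in zip(FIELDS, deltas)}


# The two samples quoted from the 5.2.1 system above.
start = (1, 0, 44, 2, 53)
end = (71, 0, 455, 5, 819)

shares = cp_time_shares(start, end)
for name in FIELDS:
    print(f"{name:5s} {shares[name]:5.1f}%")
```

The deltas are 70 user, 411 sys and 766 idle out of 1250 ticks, which rounds
to the 6% user, 33% sys and 61% idle quoted above.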

> This doesn't seem too unlikely:
> - The system is starting from cold so the filesystem cache is empty.
>   Most read(2) calls and page-faults will require physical I/O.
> - Many of the scripts just spawn a daemon process.  These generally
>   daemonise fairly early and do most of their startup in the background -
>   colliding with all the other daemons doing the same thing.
> 
> It may sound counter-intuitive but adding some judicious short sleeps
> (a few ticks each) during startup could speed things up by reducing the
> disk contention.
> 
> The other option is to add a kernel hook that records physical disk
> reads into a file and then uses that file to pre-load the filesystem
> and VM cache (in sequential order rather than randomly).  This assumes
> enough RAM to be able to load all the necessary disk blocks - but that
> is probably the norm nowadays.

This is something that Mac OS X does, but I don't know how much of the
logic that decides which disk blocks are deemed cache-worthy is in Darwin.
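
The replay half of the quoted idea can be tried from userland.  The sketch
below is only an approximation under loud assumptions: the log format
("path offset length", one record per line) is hypothetical, nothing here
is an existing kernel hook, and sorting by path and offset is merely a
stand-in for true on-disk block order, which only the kernel would know.

```python
import os
from itertools import groupby


def parse_read_log(lines):
    """Parse 'path offset length' records (a hypothetical log format)."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split from the right so paths containing spaces survive.
        path, offset, length = line.rsplit(None, 2)
        records.append((path, int(offset), int(length)))
    return records


def replay_order(records):
    """Drop duplicate records and sort so each file is read front to back."""
    return sorted(set(records))


def preload(records):
    """Read every recorded range so it lands in the cache; return bytes read."""
    total = 0
    for path, group in groupby(replay_order(records), key=lambda r: r[0]):
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            continue  # the file has gone away since the log was recorded
        try:
            for _, offset, length in group:
                total += len(os.pread(fd, length, offset))
        finally:
            os.close(fd)
    return total
```

Run early in startup, something like preload(parse_read_log(open(logfile)))
would pull the logged ranges in with one pass per file instead of leaving
them to be faulted in at random by the daemons that need them.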

> Next step is to find out where all the kernel time is going.  This
> probably means running a profiled kernel or running ktrace(1) with its
> output on a ramdisk.

Just on a hunch, someone (TM) should try a "fast-fault" mode that makes
ld-elf.so.1 fully fault in any libraries it pulls in at boot time, and see
how well that populates the cache with the necessary libraries compared
with paging them in more slowly, with more seeking.
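
A rough way to test that hunch without touching ld-elf.so.1 at all is to
fault the libraries in from userland before the daemons start.  The sketch
below is not the proposed rtld mode, only an approximation of its effect:
it maps each file and touches one byte per page, in order.  The function
names are made up, and the list of library paths is left to the caller.

```python
import mmap
import os


def fault_in(path):
    """Map a file read-only and touch one byte per page; return pages touched."""
    size = os.path.getsize(path)
    if size == 0:
        return 0  # an empty file cannot be mapped
    pages = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for offset in range(0, size, mmap.PAGESIZE):
                _ = m[offset]  # reading one byte makes the whole page resident
                pages += 1
    return pages


def fault_in_all(paths):
    """Fault in every readable file in paths; return the total pages touched."""
    total = 0
    for path in paths:
        try:
            total += fault_in(path)
        except OSError:
            pass  # skip libraries that are missing or unreadable
    return total
```

Timing a boot with and without a call such as fault_in_all(list_of_libraries)
placed ahead of the first dynamically linked daemon would show whether
sequential faulting beats on-demand paging on a given disk.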

-- 
Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green at FreeBSD.org                               \  The Power to Serve! \
 Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\