Building the INDEX

Thu Nov 2 13:16:07 UTC 2006

On Thu, Nov 02, 2006 at 10:08:49PM +1100, Edwin Groothuis wrote:
> On Thu, Nov 02, 2006 at 11:54:23AM +0100, Michel Talon wrote:
> > Today someone lent me a Core 2 Duo machine on which i have loaded FreeBSD-64.
> > Nice result is that both my python program (that i have multithreaded - but
> > there is room for improvement) and make index INDEX_JOBS=3 complete
> > building the INDEX in 8 minutes. Of course this program requires python, but
> > make index requires perl, which is not better. In view of this, one may
> 
> Is that with the "make index" called from /usr/ports? The one which
> crawls through about 16000 iterations of bsd.port.mk? And all you
> needed to do was to replace the perl script with a python script?

In fact perl is used in two places in make index. At global scope, there is
one perl script, as you are saying, which collects the information from all
ports and writes the INDEX file. But perl is also used 16000 times for each
port, to run the commande "make describe" here. The real information is
obtained by running make -V <some variables> which are then processed via
perl. I thought that replacing all these forking perls would provide big
improvement, but in fact i discovered that even make -V forks tons of stuff.

To quantify how much make describe is costing, i have modified my python
script to emit make describe for each port. The result is that one loses
20% in total time. Nevertheless the command make index INDEX_JOBS=3 
succeeds in being speedy because it has a lot of parallelism. It runs 3 jobs
in parallel, all writing to separate files, so no time is lost waiting that
an other make finishes, or in synchronisation. Using python, even when
emitting threads,  doesn't give parallelism because there is a "Giant lock"
which ensures that only one thread can execute python code at any time.
The parallelism can only been obtained when those threads launch external
code, or code in modules which release the lock. People use this facility
or asynchronous IO to speed up python programs, because you can intersperse 
execution of python code and the IO. 

In our problem, it happens that the timing is not IO bound at all, contrary 
to what i was thinking, it is essentially compute bound, the computation being
here the execution of "make -V" which needs a lot of work, for example the
parsing of the 6000 lines of bsd.port.mk! Hence you don't earn much using
asynchronous IO techniques. To get an optimal time, you have to launch
without interruption the optimal number of make commands and then parse the
output. On a Core 2 Duo, it seems that the optimal number of threads is 4.
The rest of the treatment, that is collecting the output of the make -V,
and doing all the jobs leading to the INDEX is only the affair of some seconds
for the python script, and i am sure that a perl script would be even faster,
if not as readable. If being *much* faster was desirable, perhaps a way would
be to concoct a pared down  bsd.port.mk, containing only the 
variables relevant to the problem at hand. One may imagine that if make has to
read some tenths or perhaps a hundred of lines it will be much faster than
reading 6000 lines. Of course the challenge is to devise a way to obtain
automatically the pared down version from the full version.

Anyways, this little experiment shows that some problems can be solved by
throwing hardware at them. A Core 2 Duo succeeds at running more than 3 times
faster than a P4 3Ghz, so i suppose that with a quad core one will be able to
build INDEX in 5 minutes. This is really impressive!

-- 

Michel TALON