[patch] pkg_delete(1) speedup

soralx at cydem.org
Mon Mar 31 00:49:52 PDT 2008

> > You might have noticed a thread on the mailing list called "ports system
> > woes". The submitter pointed out an inefficiency in pkg_delete routine,
> > that parses the whole /var/db/pkg over and over again for every
> > dependency of a package being removed.
> > 
> > Attached is a patch by rdivacky that implements the idea of looking up
> > all the values in a single pass over /var/db/pkg content.
> I hacked a slightly better patch that covers a part of pkg_add too..
> please review/test on:
> 	www.vlakno.cz/~rdivacky/pkg_tools.patch
> comments, benchmarks more than welcome!
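
For anyone skimming the thread, the single-pass idea quoted above can be
sketched roughly as follows. This is not rdivacky's patch: struct pkg_record,
lookup_rescan() and lookup_single_pass() are hypothetical names, and the
package database is modelled as an in-memory array instead of being read
from /var/db/pkg. Both functions return the number of records visited, so
the difference in work is easy to see.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for what one /var/db/pkg/<name>/+CONTENTS tells us. */
struct pkg_record {
    const char *name;    /* installed package name */
    const char *origin;  /* port origin, e.g. "x11/rxvt-unicode" */
};

/*
 * Old behaviour (as described in the thread): walk the whole database
 * once per wanted origin. found[j] gets the first matching package name
 * for wanted[j], or NULL. Returns the number of records visited.
 */
static size_t
lookup_rescan(const struct pkg_record *db, size_t ndb,
    const char *const *wanted, size_t nwanted, const char **found)
{
    size_t i, j, visited = 0;

    for (j = 0; j < nwanted; j++) {
        found[j] = NULL;
        for (i = 0; i < ndb; i++) {
            visited++;
            if (found[j] == NULL && strcmp(db[i].origin, wanted[j]) == 0)
                found[j] = db[i].name;
        }
    }
    return visited;
}

/*
 * Single pass: walk the database once and check every wanted origin
 * against each record. Same results, but each record is read only once.
 */
static size_t
lookup_single_pass(const struct pkg_record *db, size_t ndb,
    const char *const *wanted, size_t nwanted, const char **found)
{
    size_t i, j, visited = 0;

    for (j = 0; j < nwanted; j++)
        found[j] = NULL;
    for (i = 0; i < ndb; i++) {
        visited++;
        for (j = 0; j < nwanted; j++)
            if (found[j] == NULL && strcmp(db[i].origin, wanted[j]) == 0)
                found[j] = db[i].name;
    }
    return visited;
}
```

With D dependencies and N installed packages, the rescanning variant visits
N*D records and the single-pass variant visits N. The number of string
comparisons is the same, but on a cold cache the expensive part is reading
each +CONTENTS file, and that now happens once instead of D times.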

All right, I've been away in the Real World for a while, but I have returned %-)

First, allow me to note that it's rather impressive to see the level of
interest and the responses that my half-baked email about my little digs
into the pkg_* tools produced. Before I had even finished thinking about
whether I would have enough time to fix the tools properly, patches started
appearing on the horizon (the same day, practically)! This is quite
reassuring, as it shows that developers still care about code and algorithm
quality, even if things work OK on modern hardware (just lack of developer
time, that's all, I suppose). For that I'm grateful -- way to go :)

Now, here are the same tests on the same hardware, but
with pkg_tools.patch applied:

 [root at freen0de /usr/ports/x11/rxvt-unicode]# make
 [root at freen0de /usr/ports/x11/rxvt-unicode]# time make install
 ===>   Generating temporary packing list
 ===>  Checking if x11/rxvt-unicode already installed
 load: 0.53  cmd: pkg_info 25799 [biord] 0.06u 0.07s 0% 1532k
 /usr/bin/install -c -o root -g wheel -d /usr/local/bin
 ===> Documentation installed in /usr/local/share/doc/rxvt-unicode.
 ===>   Compressing manual pages for rxvt-unicode-9.02_1
 ===>   Registering installation for rxvt-unicode-9.02_1
 load: 0.29  cmd: sed 26266 [biord] 0.00u 0.00s 0% 728k
 load: 0.27  cmd: sh 26568 [runnable] 0.00u 0.00s 0% 164k
 load: 0.24  cmd: sh 25951 [biord] 0.14u 0.09s 0% 1228k
 load: 0.22  cmd: grep 27026 [runnable] 0.00u 0.00s 0% 256k
       This port has installed the following binaries which execute with
 real    1m13.885s
 user    0m3.903s
 sys     0m4.870s
 [root at freen0de /usr/ports/x11/rxvt-unicode]# s; sleep 300 && echo -e "<Several memory-intensive jobs performed to clean buffer>\n"
 <Several memory-intensive jobs performed to clean buffer>

 [root at freen0de /usr/ports/x11/rxvt-unicode]# time pkg_delete /var/db/pkg/rxvt-unicode-9.02_1/
 load: 0.36  cmd: pkg_delete 27480 [biord] 0.35u 0.40s 1% 972k

 real    0m37.218s
 user    0m0.448s
 sys     0m0.509s
 [root at freen0de /usr/ports/x11/rxvt-unicode]# make reinstall > /dev/null; sync
 [root at freen0de /usr/ports/x11/rxvt-unicode]# time pkg_delete /var/db/pkg/rxvt-unicode-9.02_1/

 real    0m20.261s
 user    0m0.349s
 sys     0m0.476s
 [root at freen0de /usr/ports/x11/rxvt-unicode]#

So, the time drops from over 7 minutes to 20 seconds -- sweet! :)

Notice pkg_info in the ^T output during the "Checking if x11/rxvt-unicode
already installed" phase. This one takes a while. The actual command is:
`/usr/sbin/pkg_info -q -O x11/rxvt-unicode`
 real    0m37.697s
 user    0m0.125s
 sys     0m0.360s

find_pkgs_by_origin() in info/perform.c uses the same matchbyorigin()
from lib/match.c. What's interesting here, however, is that a simple
`time grep ORIGIN /var/db/pkg/*/+CONTENTS` takes ~7 sec (XXX re-test on
that same notebook XXX), while the find_pkgs_by_origin() incarnation of
practically the same functionality takes over 30 sec.
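
To make the comparison with grep concrete, here is a minimal sketch of the
line-by-line scan for one +CONTENTS file. It is not the actual matchbyorigin()
code. It assumes the origin is recorded on a line of the form
"@comment ORIGIN:category/port", and origin_from_line() and
origin_from_stream() are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>

/* Assumed format of the origin line in +CONTENTS. */
#define ORIGIN_PREFIX "@comment ORIGIN:"

/*
 * If line starts with the origin prefix, copy the origin (without the
 * trailing newline) into buf and return 1; otherwise return 0.
 */
static int
origin_from_line(const char *line, char *buf, size_t buflen)
{
    size_t plen = strlen(ORIGIN_PREFIX);
    size_t n;

    if (buflen == 0 || strncmp(line, ORIGIN_PREFIX, plen) != 0)
        return 0;
    line += plen;
    n = strcspn(line, "\r\n");      /* length up to the end of line */
    if (n >= buflen)
        n = buflen - 1;             /* truncate rather than overflow */
    memcpy(buf, line, n);
    buf[n] = '\0';
    return 1;
}

/*
 * Read an already opened +CONTENTS stream line by line and stop at the
 * first origin line. Lines longer than the local buffer are split by
 * fgets(); this sketch does not treat that case specially.
 */
static int
origin_from_stream(FILE *fp, char *buf, size_t buflen)
{
    char line[1024];

    while (fgets(line, sizeof(line), fp) != NULL)
        if (origin_from_line(line, buf, buflen))
            return 1;
    return 0;
}
```

Stopping at the first match keeps the per-file work small, so whatever
separates the ~7 sec from the 30+ sec is more likely in how the files are
found and opened than in the string matching itself. That is a guess, not
a measurement.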

To my eye, it doesn't look like matchbyorigin() could be re-implemented
to be faster with little effort, but could somebody have a quick look
as well? Would doing mmap() instead of scanning the file line by line be
any faster? (though I'm not saying it's a great idea)
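
For reference, an mmap() variant of the scan could look roughly like the
sketch below. It is untested against the real tools, origin_from_mmap() is
a hypothetical name, and it assumes the origin sits on a line starting with
"@comment ORIGIN:". The pkg_info timing above shows well under a second of
user and sys time against 37 seconds of real time, which suggests most of
the wait is disk I/O, so mapping the file may not buy much on its own.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Assumed format of the origin line in +CONTENTS. */
#define ORIGIN_TAG "@comment ORIGIN:"

/*
 * Map the file read-only and look for a line starting with ORIGIN_TAG.
 * Returns 1 and fills buf on a match, 0 if there is none, -1 on error.
 */
static int
origin_from_mmap(const char *path, char *buf, size_t buflen)
{
    const size_t taglen = strlen(ORIGIN_TAG);
    struct stat st;
    const char *base, *p, *end, *nl;
    size_t n;
    int fd, rv = 0;

    if (buflen == 0)
        return -1;
    if ((fd = open(path, O_RDONLY)) == -1)
        return -1;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* a zero-length mapping would fail */
        close(fd);
        return 0;
    }
    base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (base == MAP_FAILED)
        return -1;

    end = base + st.st_size;
    p = base;
    while (p < end) {
        nl = memchr(p, '\n', (size_t)(end - p));
        if (nl == NULL)
            nl = end;               /* last line has no newline */
        if ((size_t)(nl - p) >= taglen && memcmp(p, ORIGIN_TAG, taglen) == 0) {
            n = (size_t)(nl - p) - taglen;
            if (n >= buflen)
                n = buflen - 1;     /* truncate rather than overflow */
            memcpy(buf, p + taglen, n);
            buf[n] = '\0';
            rv = 1;
            break;
        }
        if (nl == end)
            break;
        p = nl + 1;
    }
    munmap((void *)base, (size_t)st.st_size);
    return rv;
}
```

This saves the copy into a stdio buffer, but it still costs an open(),
fstat() and mmap() per package, and the pages still have to come off the
disk the first time.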

BTW, I have a feeling that the "Registering installation" step should be made
more verbose. It takes more time than anything else now, and one's left
to wonder what exactly is going on (seems like quite a few different

Also, I found that during the "Checking if <*> already installed" step,
'mtree' (XXX find out exact command here XXX) is called (from bsd.port.mk?),
which can be skipped by setting NO_MTREE. What effect does [not] calling
mtree have?

And to conclude, here we have a benchmark from my faster machine (Core2
Dual 2.72GHz, 2G RAM, MK2018GAS 4200RPM HDD with 2M buffer),
BEFORE the patch was applied:

 [root at soralx /usr/ports/x11/rxvt-unicode]# time make install > /dev/null
 real    0m23.097s
 user    0m0.000s
 sys     0m0.219s
 [root at soralx /usr/ports/x11/rxvt-unicode]# time pkg_delete /var/db/pkg/rxvt-unicode-9.02/
 real    0m2.243s
 user    0m0.056s
 sys     0m0.202s
 [root at soralx /usr/ports/x11/rxvt-unicode]# time make reinstall > /dev/null
 real    0m26.867s
 user    0m0.641s
 sys     0m0.936s
 [root at soralx /usr/ports/x11/rxvt-unicode]# time pkg_delete /var/db/pkg/rxvt-2.6.4_3/ #very few depends
 real    0m0.090s
 user    0m0.009s
 sys     0m0.001s
 [root at soralx /usr/ports/x11/rxvt-unicode]# time pkg_delete /var/db/pkg/rxvt-unicode-9.02_1/
 real    0m0.521s
 user    0m0.073s
 sys     0m0.443s

After patching pkg_install/:

 [root at soralx /usr/ports/x11/rxvt-unicode]# time make reinstall > /dev/null
 real    0m3.602s
 user    0m0.534s
 sys     0m0.905s
 [root at soralx /usr/ports/x11/rxvt-unicode]# time pkg_delete /var/db/pkg/rxvt-unicode-9.02_1/
 real    0m0.079s
 user    0m0.033s
 sys     0m0.046s

Not bad, we see improvements in both pkg_install and pkg_delete, as

> roman	

[SorAlx]  ridin' VN1500-B2

