Parallel Builds
Benjamin Lutz
mail at maxlor.com
Thu Oct 19 09:24:57 UTC 2006
Hello,
Since multi-core processors are becoming popular (or, more
egocentrically, since I've acquired one), I've become interested in
parallel compilation. Unfortunately, it seems that parallel builds of
any kind are completely unsupported by the ports framework at the
moment. My experimentation with parallel builds has led to a lot of
build failures: many ports fail when compiled with gmake -j2 instead of
gmake, and running, say, two portupgrade instances in parallel often has
them step on each other.
Since using n processor cores instead of just one can give close to an
n-fold speed increase for builds that parallelize well, particularly
when compiling C++ code, I'm interested in investigating what it would
take to add some degree of parallelism to the ports.
I'm sure I'm not the only person that has thought about this. Maybe
there already is an effort to allow for parallelism in port builds. I
would therefore like people working on this to speak up, or more
generally, to start a discussion on how this could be implemented. I'll
start by giving my own thoughts. I currently see three ways to add
parallelism to the ports:
o Mark the ports that allow parallel building by adding a new flag
that can be used in port Makefiles, e.g. PARALLEL_BUILDING=yes.
With such a port, the build target would call, say
"gmake -j${PARALLEL_NUM}" instead of just "gmake". PARALLEL_NUM
would be set in /etc/make.conf. If it is undefined, the build
target would fall back to the old behavior. I'll call this
"micro-parallelism" for now.
Advantages:
+ The modification to the ports framework would be relatively small,
since the build tool (make/gmake/whatever) takes care of the
difficult bits like locking.
+ Real build speed increase, particularly with large ports where it
matters most (these usually have non-linear compilation
dependencies, so they can be parallelized).
Disadvantages:
- Each port would have to be marked with PARALLEL_BUILDING=yes
individually. This means more work for the maintainers, and will
mean that introduction of this feature will take time. (On the
other hand, there are only a few ports that are both large and
popular, e.g. KDE; adding this feature just for those would already
be a big win.)
- For a lot of software it is not obvious whether it supports
parallel building, and it may have a low (but non-zero) probability
of compilation failure with parallel building. This can lead to ports
being marked with PARALLEL_BUILDING=yes in error, and thus to users
encountering build failures. (Or maybe that's not that
much of a problem: after reports of build failures come in, the
PARALLEL_BUILDING=yes flag could be removed again, and users who
depend on the build always succeeding could simply not use
parallel builds. Another idea would be to use
PARALLEL_BUILDING=maybe if the port maintainer is unsure, which
would allow conservative users to use parallel building only for
ports that are guaranteed to compile in parallel.)
- The build speed advantage for ports whose build can't be
parallelized well is small (I believe that stage 1 of the gcc build
would be an example of this). Also, small ports, which spend
proportionally a lot of their time in the configure script, would
not see much of a speed-up.
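
A minimal sketch of the micro-parallelism decision described above,
written in Python purely for illustration (the real change would live in
the ports framework's make code). PARALLEL_BUILDING and PARALLEL_NUM are
the names proposed in this mail; the function name, the dictionaries
standing in for the port Makefile and /etc/make.conf, and the
allow_maybe switch are assumptions made up for this example.

```python
def build_command(port_vars, make_conf, build_tool="gmake", allow_maybe=False):
    """Return the argv list for a port's build step.

    port_vars:   variables set in the port's Makefile (illustrative dict).
    make_conf:   variables set in /etc/make.conf (illustrative dict).
    allow_maybe: hypothetical switch; if True, PARALLEL_BUILDING=maybe is
                 treated like PARALLEL_BUILDING=yes.
    """
    accepted = {"yes", "maybe"} if allow_maybe else {"yes"}
    marked = port_vars.get("PARALLEL_BUILDING") in accepted
    jobs = make_conf.get("PARALLEL_NUM")
    if marked and jobs is not None and int(jobs) > 1:
        return [build_tool, "-j%d" % int(jobs)]
    # Unmarked port, or PARALLEL_NUM undefined: fall back to the old behavior.
    return [build_tool]


if __name__ == "__main__":
    conf = {"PARALLEL_NUM": "2"}
    print(build_command({"PARALLEL_BUILDING": "yes"}, conf))    # ['gmake', '-j2']
    print(build_command({}, conf))                              # ['gmake']
    print(build_command({"PARALLEL_BUILDING": "maybe"}, conf))  # ['gmake']
```

Conservative users would leave allow_maybe off, so ports marked "maybe"
build serially exactly as they do today.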
o Have the ports framework support building of several ports in
parallel. This could mean that either "make -j2 install" works in
a port directory (so the build of a port's dependencies would happen
in parallel), or that it's possible to run more than one port build
at one time. As above, the amount of parallelism would be
configurable with a variable in /etc/make.conf, and there'd be a
fallback to the old behavior. I'll call this "macro-parallelism".
Advantages:
+ No change needed to the individual ports (probably).
+ Assuming a correct implementation, no increased probability for
build failures.
+ Build speed-up for software consisting of several packages, e.g.
KDE, or when installing a new system.
Disadvantages:
- Probably difficult to implement. Locking, build failures and
interruptions would have to be taken care of. Maybe it's not
actually possible to do this with our make(1) (I haven't
properly investigated this yet).
- No speed gain when updating single large ports, e.g. gcc. (To be
fair, it must be said that some of the large ports, e.g.
OpenOffice.org, don't support micro-parallelism either. Macro-
parallelism would at least allow the otherwise unused CPUs to
do something sometimes.)
o Leave the ports framework as it is, and implement support for
parallel building in an add-on tool, e.g. portupgrade. The tool would
support automatic parallelism ("portupgrade -a" would automatically
build ports in parallel where possible), or allow several
user-created instances to run at the same time. I'll call this
"tool-based macro-parallelism".
Advantages:
+ No change needed to the ports at all (at least theoretically, in
practice minor changes might make the development of the build
tool much easier).
+ Assuming a correct implementation, no increased probability for
build failures.
+ Build speed-up for software consisting of several packages, e.g.
KDE, or when installing a new system.
Disadvantages:
- Moderately difficult to implement. Locking, build failures and
interruptions would have to be taken care of. I don't see problems
that can't be solved though.
- No speed gain when updating single large ports, e.g. gcc. (To be
fair, it must be said that some of the large ports, e.g.
OpenOffice.org, don't support micro-parallelism either. Macro-
parallelism would at least allow the otherwise unused CPUs to
do something sometimes.)
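
To make the macro-parallelism idea more concrete, here is a rough Python
sketch of the scheduling core such a tool (or the framework) would need:
build ports in dependency order, run at most a fixed number at once, and
never start a port whose dependency failed. It is only a sketch under
assumptions: the function names, the dependency dictionary and the port
names are illustrative, it needs Python 3.9 or later for graphlib, and it
does not address locking or interrupted builds at all.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def build_all(deps, build_one, max_parallel=2):
    """Build every port in deps, running at most max_parallel at once.

    deps maps a port name to the set of ports it depends on.
    build_one(port) performs one build and returns True on success.
    Returns (built, failed, skipped): a list in completion order, and
    two sets.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    built, failed = [], set()
    running = {}  # future -> port name
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        while sorter.is_active():
            # Start every port whose dependencies have all been built.
            for port in sorter.get_ready():
                running[pool.submit(build_one, port)] = port
            if not running:
                break  # everything left depends on a failed port
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for fut in done:
                port = running.pop(fut)
                if fut.result():
                    built.append(port)
                    sorter.done(port)  # releases ports waiting on this one
                else:
                    failed.add(port)   # its dependents are never released
    all_ports = set(deps) | {d for ds in deps.values() for d in ds}
    skipped = all_ports - set(built) - failed
    return built, failed, skipped


if __name__ == "__main__":
    # Illustrative dependency graph, not taken from the real ports tree.
    deps = {"qt": set(), "kdelibs": {"qt"},
            "kdebase": {"kdelibs"}, "kdegames": {"kdelibs"}}
    print(build_all(deps, lambda port: True))
```

In this example only kdebase and kdegames can actually build side by
side, which shows how much the speed-up depends on the shape of the
dependency graph.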
A combination of micro- and macro-parallelism seems attractive, since
there are situations where only one of these is supported. However, I
don't see how it could work properly (barring a naive approach where
you end up running n^2 processes): it would require cooperation between
make(1) or the add-on tool on the one hand and the build tool used by
the individual port on the other, and the latter is more or less an
unknown.
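
The n^2 figure, and one crude way around it, can be shown with a few
lines of Python. This is only an illustration under assumptions, not a
solution to the cooperation problem: the function names are made up, and
the static split simply hands each concurrently built port a fixed share
of the cores, so any share a port can't use stays idle.

```python
def naive_process_count(cores):
    """Macro level runs `cores` ports at once, each with -j`cores`."""
    return cores * cores


def split_job_budget(cores, concurrent_ports):
    """Statically divide `cores` build jobs among ports built at once.

    Returns one -j value per port; the values always sum to `cores`.
    """
    if not 1 <= concurrent_ports <= cores:
        raise ValueError("need between 1 and `cores` concurrent ports")
    base, extra = divmod(cores, concurrent_ports)
    # The first `extra` ports get one job more than the rest.
    return [base + 1 if i < extra else base for i in range(concurrent_ports)]


if __name__ == "__main__":
    print(naive_process_count(4))   # 16
    print(split_job_budget(4, 2))   # [2, 2]
    print(split_job_budget(4, 3))   # [2, 1, 1]
```

A split like this needs no knowledge of the port's build tool, which is
exactly why it is wasteful: it cannot hand unused jobs back to another
port.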
Phew. That turned into a long email. If you're still reading, thanks!
Cheers
Benjamin