Re: lang/rust is super slow to build

From: Edward Sanford Sutton, III <>
Date: Mon, 12 Dec 2022 21:20:29 UTC
On 12/12/22 12:27, Pat Maddox wrote:
 > Using poudriere, lang/rust is at 2 hours and counting on my 10-core i9
 > w/ 128 gigs of RAM.
 > Does that sound right? It seems extremely slow to me, but this is my
 > first time building it.
 > How long does it take others to build? What options are you using, or
 > any other suggestions for shortening the time?

   My last build took 6 hours 43 minutes, but I am on an older i7-3820
with half the cores disabled. CPU and RAM were likely oversaturated with
unrelated tasks, and more than 1 core was working on Rust.
   Short answer for my poudriere setup in general, which also helps
Rust: I usually set cores per port equal to the cores in the system and
then set poudriere to build 2 to 4 ports simultaneously. Running under
idprio usually keeps the system fairly responsive despite the overloaded
CPU, but RAM going to swap or running out is more troublesome.
   Running (up to) 8 jobs on my system with 32GB RAM at 1 core per job
was never a good default. Too often cores sit idle because dependencies
block future jobs, and at the end it's common to have one big port being
worked on alone for a long time.
   "USE_TMPFS=all" in poudriere.conf speeds up the repeated
teardown+buildup of build environments a lot, but it goes through my
limited RAM much too fast. That only happens on a few bigger ports, but
they always end up trying to build together.
   Not all ports are compatible with MAKE_JOBS (a port usually forces it
off when that's the case), and there are times when fewer than the
maximum cores are in use (sometimes only 1), so building multiple ports
simultaneously helps make better use of the CPU. With multiple jobs per
port and multiple ports building simultaneously, the CPU can stay busy
while other work, like tearing down and building up a port's
environment, is going on. Try to balance CPU allocation between those 2
numbers to minimize the time either limitation leaves resources idle,
more so if you oversaturate the cores.
   ccache 'may' help with future repeat builds, but it only accelerates
C/C++ and not Rust compiler commands, so its benefit will be limited.
   Compressing packages at the best level slows down their creation but
takes less disk space and can result in faster decompression. tzst
instead of txz with maximum compression can speed up both package
creation and extraction but will cost more disk space.
   The longer answer, including which tweaks go where to make that
happen, is below.
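   As a rough illustration of balancing those two numbers, the sh sketch
below multiplies simultaneous ports by make jobs per port to get the
worst case. The function names and example figures are hypothetical and
are not something poudriere provides.

```sh
#!/bin/sh
# Hypothetical helpers (not part of poudriere) for reasoning about load.

# Worst-case count of compile processes: every builder using all its make jobs.
peak_jobs() {
    builders=$1
    jobs_per_port=$2
    echo $(( builders * jobs_per_port ))
}

# Worst-case load as a percentage of the cores (100 = exactly saturated).
oversubscription_pct() {
    builders=$1
    jobs_per_port=$2
    cores=$3
    echo $(( builders * jobs_per_port * 100 / cores ))
}
```

   For example, `oversubscription_pct 2 10 10` describes 2 builders with
10 make jobs each on a 10-core machine and prints 200, meaning up to
twice as many compile processes as cores when both ports are in a
parallel phase of their build.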
   I do set /usr/local/etc/poudriere.d/make.conf with various
parameters. I went to tzst instead of txz, which builds larger packages,
but I figure it's a bit faster at compressing+decompressing and uses
less RAM (a common shortage for me); I haven't properly tested that it
helped me. I still have the XZ customization in case I switch back; it
forced higher compression for smaller packages at the expense of more
CPU+RAM to make them. If I recall, xz extraction speed is related to how
much data it reads, so packages that are smaller but CPU+RAM expensive
to make extract faster (assuming you don't run out of RAM on the
extraction step). Jobs I set based on my limited cores but I think ports
tree.

XZ_OPT+= -9e -Mmax

   I use either of the entries below, added to
/usr/local/etc/poudriere.conf, to permit multiple cores to be used per
port or per selected ports; one of them is required for
MAKE_JOBS_NUMBER in the earlier make.conf to be used. Last I recall,
poudriere still defaults to 1 CPU per port, which makes individual ports
potentially much slower under the expectation that the other cores are
used for other ports. Some ports cannot benefit from 2+ cores during
parts of their build, and sometimes poudriere won't have enough ports
available, due to dependencies or reaching the end of the queue, to keep
all cores busy at 1 core per port.


ALLOW_MAKE_JOBS_PACKAGES="pkg ccache perl* gettext-tools zstd w3m cmake 
gtk* ruby* py* chromium open* *office* binutils *gcc* llvm* mesa-dri 
*web* rust samba* mysql* openblas *jdk* osg* vtk* dcmtk 
plasma5-plasma-desktop node krita firefox thunderbird virtualbox* 
qt5-declarative suitesparse qt5-webengine"
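
   For comparison, a sketch of the global form of the same switch. This
is an assumption based on poudriere.conf.sample, not a line taken from
the setup described here. poudriere.conf is read as sh, so it is a plain
variable assignment:

```sh
# poudriere.conf sketch (assumed from poudriere.conf.sample): allow
# parallel make jobs for every port instead of a selected list.
ALLOW_MAKE_JOBS=yes
```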

   I generally execute poudriere with `git -C /usr/ports pull 
--ff-only&&idprio 31 poudriere bulk -J2:12 -j local -p local -f 
/root/installed-port-list -f /root/prime-origin`. That way I start from
an updated tree and run the build as an idle-priority background job to
keep the system more responsive (maybe it's not the best way). I limit
it to building 2 ports simultaneously because, with 32GB and
"USE_TMPFS=all" in poudriere.conf, a single port can use over half my
RAM; going higher likely results in swap and crashing builds, especially
when considering other system use like Firefox (best not used, or
restarted, due to excessive RAM consumption with many windows+tabs).
":12" likely needs more tuning. I try to use my local ports tree and src
tree, but poudriere may be introducing additional checks as a result,
which slows it down a lot.
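   The one-liner above can also be read piece by piece. The sh sketch
below splits it into named parts; the variable and function names are
hypothetical, the values are the ones from the command, and the meaning
given for the second -J number follows poudriere-bulk(8) as far as it is
documented there.

```sh
#!/bin/sh
# Sketch only: the one-liner split into named pieces. Variable and function
# names are hypothetical; the values are the ones from the command above.
PORTSDIR=/usr/ports
JAIL=local       # -j: jail to build in
TREE=local       # -p: ports tree to build from
BUILDERS=2       # -J, first number: ports built at the same time
PREBUILD=12      # -J, second number: jobs for the steps before the build

# Print the bulk command so it can be inspected before running it.
bulk_command() {
    echo "idprio 31 poudriere bulk -J${BUILDERS}:${PREBUILD}" \
        "-j ${JAIL} -p ${TREE}" \
        "-f /root/installed-port-list -f /root/prime-origin"
}

# Update the tree first; build only if the fast-forward pull succeeded.
update_and_build() {
    git -C "$PORTSDIR" pull --ff-only && eval "$(bulk_command)"
}
```

   Calling `bulk_command` only prints the command line, which makes it
easy to try a different BUILDERS or PREBUILD value before committing to
a long build with `update_and_build`.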
   I use two separate files so I can maintain a list of what is on my
system from a manually typed list of "I want this", which means I could
easily do a new OS install, or fully wipe packages and clean /usr/local
of any debris, and get reinstalled back to where I want to be. I use
`pkg prime-origins > /root/prime-origins`, which only lists the final
ends of the dependency tree, even if I want a dependency to always be
installed too. It is a good way to generate a minimal list of the
package installs needed to reproduce your current complete set if you
haven't properly kept track of all desired packages. If desired
dependencies are not listed alongside those end branches, then removing
packages, or having the dependencies of end branches change, can lead to
me uninstalling desired packages with `pkg autoremove`. If a port is
broken, it still gets attempted unless I make sure neither list tries to
build it, and if a port is removed I then have 2 files to clean up
before poudriere runs again.
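   The cleanup of both files could be scripted. A minimal sh sketch,
with hypothetical function names and assuming each list holds one port
origin per line:

```sh
#!/bin/sh
# Hypothetical helpers for keeping the two list files in step.

# Remove one origin (e.g. a broken or deleted port) from every list given.
drop_origin() {
    origin=$1
    shift
    for list in "$@"; do
        tmp=$(mktemp) || return 1
        # -x: whole-line match, -F: literal string, -v: keep everything else.
        # grep exits 1 when no lines remain; only a status above 1 is an error.
        grep -vxF -- "$origin" "$list" > "$tmp" || [ $? -eq 1 ] || {
            rm -f "$tmp"
            return 1
        }
        cat "$tmp" > "$list"
        rm -f "$tmp"
    done
}

# Show the combined, de-duplicated set of origins from all list files.
merged_origins() {
    sort -u "$@"
}
```

   For example, `drop_origin lang/rust /root/installed-port-list 
/root/prime-origins` would take lang/rust out of both lists in one step.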
   Not sure how much ccache helps, but it's faster any time a port is
building compatible C/C++, with minimal added overhead for any compile
it doesn't already have a compatible cache entry for. "compiler_check =
content" in ccache.conf helps keep the cache valid if I rebuild a
compiler that isn't actually an update, but that still isn't enough to
help when a new port version extracts to a different path; it really
seems like primary port extraction should take place in a path that does
not depend on the version. I believe it is devel/sccache that would help
speed up anything that compiles using the Rust compiler, but I thought I
heard it was a little slower with C/C++, and poudriere/ports tree
doesn't have built-in support for it like it does for ccache.
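   A small sh sketch of putting that ccache setting in place. The
function name is hypothetical, and it assumes ccache reads ccache.conf
from the directory named by CCACHE_DIR:

```sh
#!/bin/sh
# Sketch: write the ccache setting mentioned above into a cache directory.
# Note that this overwrites any existing ccache.conf in that directory.
write_ccache_conf() {
    dir=$1
    mkdir -p "$dir" || return 1
    printf '%s\n' 'compiler_check = content' > "$dir/ccache.conf"
}
```

   Called as `write_ccache_conf /var/cache/ccache`, this would pair with
a `CCACHE_DIR=/var/cache/ccache` line in poudriere.conf; that path is
an assumption taken from the sample config, not from the setup above.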

 > Pat