RE: How poudriere's PACKAGE_FETCH_WHITELIST should work?

From: Mark Millard <marklmi_at_yahoo.com>
Date: Wed, 15 Feb 2023 21:02:26 UTC
Miroslav Lachman <000.fbsd_at_quip.cz> wrote on
Date: Wed, 15 Feb 2023 19:50:59 UTC :

> Poudriere and fetching build dependecies - I would like to know how it 
> is supposed to work?
> 
> I use poudriere a bunch of years, with ports overlays etc. As I need to 
> rebuild packages for our machines many times a month I am tired of 
> building huge and slow packages like llvm, gcc, rust (as they often eat 
> all memory+swap then build is killed by OOM).

You may or may not have run into the following sort of
thing.

Using rust as an example: its build uses 10+ GiBytes of file
system space (17+ GiBytes as I have things set up). So, if
this ends up in tmpfs space, RAM+SWAP has to cover that large
file system space. Only 2 simple USE_TMPFS= settings avoid
this:

USE_TMPFS=data
USE_TMPFS=no

There is also the following pair that can instead be used
to avoid this for just specific ports when using other
USE_TMPFS= settings. For example:

TMPFS_BLACKLIST="rust"
TMPFS_BLACKLIST_TMPDIR=${BASEFS}/data/cache/tmp

(Of course, the file system for TMPFS_BLACKLIST_TMPDIR
needs to have sufficient space available for whatever
combination of blacklisted ports might be building
at the same time.)
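
A rough sketch of checking that space ahead of time follows. The
function name has_free_kib is made up for illustration and is not
part of poudriere. The path in the usage comment assumes
BASEFS=/usr/local/poudriere, and the 17 GiByte figure is just the
rust number from my setup:

```shell
# Hypothetical helper, not part of poudriere: succeed if the file
# system holding directory $1 has at least $2 KiB available.
has_free_kib() {
    dir=$1
    needed_kib=$2
    # df -Pk prints POSIX-format output in 1024-byte units; field 4
    # of the second line is the available space.
    avail_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    # Treat a failed df (empty result) as "not enough space".
    [ -n "$avail_kib" ] && [ "$avail_kib" -ge "$needed_kib" ]
}

# Usage (assumes BASEFS=/usr/local/poudriere; 17 GiB in KiB):
#   has_free_kib /usr/local/poudriere/data/cache/tmp $((17 * 1024 * 1024)) \
#       || echo "TMPFS_BLACKLIST_TMPDIR file system looks too small"
```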

(Note: my familiarity is with poudriere-devel .)

(RAM+SWAP use for process memory when building the likes
of rust is smaller than such tmpfs use but is still
notable.)

(There are other relevant system settings involved for
OOM should the above sort of thing prove insufficient.
But I'll stop with the above for now.)

> I tied to setup PACKAGE_FETCH with the two following variables in 
> poudriere.conf
> PACKAGE_FETCH_BRANCH=quarterly
> PACKAGE_FETCH_WHITELIST="gcc* rust llvm* gcc90 gcc10 gcc11 gcc12 
> gcc13 gcc14 llvm10 llvm11 llvm12 llvm13 llvm14 llvm15 lua54"
> 
> When I started usual "poudriere bulk" only gcc12 were fetched, llvm10 
> and rust built from sources:
> 
> [00:01:02] Package fetch: Will fetch 1 packages from remote or local 
> pkg cache
> The following packages will be fetched:
> New packages to be FETCHED:
> gcc12: 12.2.0_5 (81 MiB: 100.00% of the 81 MiB to download)
> Number of packages to be fetched: 1
> 
> I thought maybe I have some different options selected for llvm and rust 
> than the default on official FreeBSD packages, I double checked, removed 
> stored options and started another poudriere bulk with different package 
> set (llvm10 and rust will be needed for the set).
> 
> This time the rust package was downloaded, but llvm10 built from source 
> again:
> 
> [00:00:22] Package fetch: Will fetch 1 packages from remote or local 
> pkg cache
> The following packages will be fetched:
> New packages to be FETCHED:
> rust: 1.66.0 (112 MiB: 100.00% of the 112 MiB to download)
> Number of packages to be fetched: 1
> The process will require 112 MiB more space.
> 112 MiB to be downloaded.
> [xxxxx] Fetching rust-1.66.0.pkg: 100% 112 MiB 39.2MB/s 00:03
> 
> But the mystery is that "poudriere bulk" failed on building rust even if 
> it should be used from fetched package:

A possibility for the type of issue:

Using 1.66 vs. 1.67 as an example, there was a time frame when
the most recent package available to download was 1.66 based
but the port had been updated to 1.67 . The package for 1.67
showed up later after the FreeBSD build-server poudriere
bulk activity and the distribution to the download server
that you happen to (potentially) use.

So it might have downloaded 1.66 but discovered that the ports
tree involved was at 1.67 instead.

Unless port updates are delayed until the package is also
available, so that they can be published as a unit, there is
no way to avoid such mismatches. Attempting to build before
the new version of the package is available leads to doing
the build locally in order to match the ports tree involved.
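
The 1.66 vs. 1.67 reasoning can be sketched as below. This is
only an illustration of the idea, not poudriere's actual code:
the function name fetch_or_build is made up, and PORTS_TREE in
the comments is a placeholder for whatever ports tree poudriere
is using:

```shell
# Hypothetical illustration, not poudriere's actual logic.
# $1: version of the downloadable package
# $2: version in the ports tree being built from
fetch_or_build() {
    remote_version=$1
    ports_version=$2
    if [ "$remote_version" = "$ports_version" ]; then
        # Versions agree: the downloaded package can stand in
        # for a local build.
        echo "fetch"
    else
        # Versions differ: a local build is needed to match
        # the ports tree.
        echo "build"
    fi
}

# On a FreeBSD system the two inputs could come from something like
# (an assumption about the setup, not checked against poudriere):
#   pkg rquery -r FreeBSD '%v' rust
#   make -C "$PORTS_TREE/lang/rust" -V PKGVERSION
#
# Usage:
#   fetch_or_build 1.66.0 1.67.0
#   fetch_or_build 1.67.0 1.67.0
```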

This is more of a problem for ports updated frequently.

There is also the fact that the packages for distribution are
updated after the FreeBSD build server's poudriere bulk run
that involved the package: the packages are updated more like
"as a unit" relative to the overall bulk build, but
distribution is spread over time and space.

> [04:11:44] Failed ports: lang/rust:build
> [04:11:44] Skipped ports: devel/cargo-c graphics/libimagequant 
> graphics/py-pillow@py37
> 
> I checked the logs and the rust build process was killed by OOM killer. 
> (but why poudriere was building it if it was already fetched?)
> 
> I started poudriere bulk again, rust was fetch again a this time it was 
> really used to build other packages, no rebuild of rust from sources needed.

In my illustration above, this could have been that the package
for 1.67 had become available and was downloaded.

> I am really confused why Poudriere fetches only 1 package at a time if 
> it should fetch gcc12, rust and llvm10?

My notes are not directly about the above issue.

> Why Poudriere tried to build rust if it fetches it as pkg?

I've tried to indicate a possibility for some examples of this.

> And why it does not fetch llvm10 even if it is available and we do not 
> have options stored for devel_llvm10?


Using a more overall example context: poudriere rebuilds
ports that are dependent on ports that have new versions,
even if the port in question does not have a new version
number. After rebuilding, the lack of a new version leads
to package-update activity not installing an update. So:
built but not put to use.

This gets to be sort of like the earlier 1.66 vs. 1.67
illustration: The downloadable package might have been
based on an earlier version of some other port(s) and,
until rebuilt and distributed, is somewhat mismatched
with some port(s) it was dependent on. This could lead
to a rebuild, even if the rebuild might end up not being
used because of a lack of version number change.
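
As a sketch of that kind of dependency mismatch (again just an
illustration, not poudriere's actual check): deps_match and the
libfoo/libbar names below are made up, and the dependency lists
are assumed to be space-separated name-version strings:

```shell
# Hypothetical illustration, not poudriere's actual logic.
# $1: dependencies recorded in the downloadable package
# $2: dependencies the local ports tree would produce
# Both are space-separated lists of name-version strings.
deps_match() {
    # $1 and $2 are deliberately unquoted so that each list is
    # split into one entry per line before sorting.
    recorded=$(printf '%s\n' $1 | sort)
    wanted=$(printf '%s\n' $2 | sort)
    # Order does not matter; any differing entry is a mismatch.
    [ "$recorded" = "$wanted" ]
}

# Usage (made-up names): libfoo moved from 1.2 to 1.3 in the ports
# tree, so the downloadable package no longer matches:
#   deps_match "libfoo-1.2 libbar-3.0" "libbar-3.0 libfoo-1.3" \
#       || echo "mismatch: rebuild locally"
```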


===
Mark Millard
marklmi at yahoo.com