Re: Resolved: devel/llvm13 build: "ninja: build stopped: subcommand failed"

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 15 Aug 2022 13:49:20 UTC

On 2022-Aug-15, at 06:05, Nuno Teixeira <eduardo@freebsd.org> wrote:

> Hi Mark,
> 
> I use TMPFS=no

That is the wrong notation. Quoting /usr/local/etc/poudriere.conf.sample :

QUOTE
# Use tmpfs(5)
# This can be a space-separated list of options:
# wrkdir    - Use tmpfs(5) for port building WRKDIRPREFIX
# data      - Use tmpfs(5) for poudriere cache/temp build data
# localbase - Use tmpfs(5) for LOCALBASE (installing ports for packaging/testing)
# all       - Run the entire build in memory, including builder jails.
# yes       - Enables tmpfs(5) for wrkdir and data
# no        - Disable use of tmpfs(5)
# EXAMPLE: USE_TMPFS="wrkdir data"
USE_TMPFS=yes
END QUOTE

Note: including wrkdir uses a lot of tmpfs space for some
ports.

So a correct notation in /usr/local/etc/poudriere.conf
would be to replace the USE_TMPFS=yes assignment with:

USE_TMPFS=no

This might be why something is competing for RAM+SWAP
such that you run out: You might have wrkdir involved
via an implicit use of yes if you actually used the
notation TMPFS=no instead.
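
A quick way to check which assignments are actually in the
file is something like the following sketch. The function
name show_tmpfs_settings is made up for illustration; it only
lists uncommented USE_TMPFS= and TMPFS= lines, with their
line numbers:

```sh
# Hypothetical helper (not part of poudriere): list the
# uncommented USE_TMPFS= / TMPFS= assignments in a config file.
show_tmpfs_settings() {
    # -n prefixes each match with its line number;
    # lines starting with '#' do not match the pattern.
    grep -nE '^[[:space:]]*(USE_)?TMPFS=' "$1"
}

# Example use:
#   show_tmpfs_settings /usr/local/etc/poudriere.conf
```

If a TMPFS= line shows up but no USE_TMPFS=no line does,
the intended setting is not in place.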

While RAM+SWAP use is high (but not yet out of space),
you can use a command like:

# df -m | egrep "^(Filesystem|tmpfs) "
Filesystem                                                             1M-blocks Used   Avail Capacity  Mounted on
tmpfs                                                                       1024   35     988     3%    /usr/local/poudriere/data/.m/main-CA72-bulk_a-default/ref/.p
tmpfs                                                                      31790    0   31790     0%    /usr/local/poudriere/data/.m/main-CA72-bulk_a-default/ref/var/db/ports
tmpfs                                                                       1024    0    1023     0%    /usr/local/poudriere/data/.m/main-CA72-bulk_a-default/01/.p

to get a clue how much tmpfs is in use to see if it
is large relative to your RAM+SWAP space.
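
To total that up instead of reading it by eye, a small awk
filter could be used. This is only a sketch: tmpfs_used_mb
is a hypothetical name, and it assumes the df -m column
layout shown above (Used in the 3rd column, tmpfs in the
1st column):

```sh
# Hypothetical helper: sum the "Used" column (MiB) of all
# tmpfs lines read from standard input in `df -m` format.
tmpfs_used_mb() {
    awk '$1 == "tmpfs" { used += $3 } END { print used + 0 }'
}

# Example use:
#   df -m | tmpfs_used_mb
```

The result can then be compared to the RAM reported by
sysctl -n hw.physmem (in bytes) plus the swap reported by
swapinfo -m (in MiB).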

> and I don't have WITH_DEBUG set.

Good.

> I will test with MAKE_JOBS_NUMBER=n in /usr/local/etc/poudriere.d/make.conf and see what max n I can get it to compile and observe used ram+swap.
> 
> Another possibility is to increase swap to <=60GB, but I'd like to avoid that because I would need to resize the freebsd-zfs partition.
> What do you think about increasing swap?

As I mentioned before, you can add SWAP without modifying
your existing media: add new media that has another
freebsd-swap partition and set things up so that the old
and new freebsd-swap partitions are both active at the same
time, thereby increasing the total SWAP. So, for example,
a USB3 NVMe SSD could be used to increase the SWAP space
available. (You may well have other reasons to not want
to do such a thing. But it is technically possible to do.)
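
As a rough illustration only: assuming the new media shows
up with a freebsd-swap partition at /dev/da0p1 (a
hypothetical device name; yours will likely differ), an
additional /etc/fstab line alongside the existing swap
entry might look like:

```
# Device        Mountpoint  FStype  Options  Dump  Pass#
# (hypothetical device name for the added media)
/dev/da0p1      none        swap    sw       0     0
```

Running swapon on that device activates it without a
reboot, and swapinfo then lists each active swap device
along with the total.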

But, hopefully, you can find and fix why something extra is
competing for your RAM+SWAP and so avoid needing more SWAP
space.

> Thanks
> 
> Mark Millard <marklmi@yahoo.com> wrote on Monday, 15/08/2022 at 04:26:
> On 2022-Aug-14, at 20:06, Tomoaki AOKI <junchoon@dec.sakura.ne.jp> wrote:
> 
> > On Sun, 14 Aug 2022 12:23:03 -0700
> > Mark Millard <marklmi@yahoo.com> wrote:
> > 
> >> On 2022-Aug-14, at 11:07, Nuno Teixeira <eduardo@freebsd.org> wrote:
> >> 
> >>> I will follow https://docs.freebsd.org/en/books/handbook/book/#disks-growing and resize the current swap, but before that I will have to make sure that backups are ok in case something goes wrong.
> >>> 
> >>> I've taken a note about total swap <=60GB
> >>> 
> >>> . . .
> >> 
> >> 
> >> I forgot an important question relative to the resource use
> >> for building devel/llvm13 and later: do you need/want the
> >> fortran compiler?
> >> 
> >> If not, you can disable the options: FLANG MLIR
> >> (building FLANG would require building MLIR)
> >> 
> >> Then the build will be far less memory intensive
> >> and take less time.
> >> 
> >> It had slipped my mind that my builds have these 2 options
> >> disabled.
> >> 
> >> ===
> >> Mark Millard
> >> marklmi at yahoo.com
> > 
> > Is there any possibility that something like Bug 264949 [1] for gcc11
> > is happening?
> > 
> > For amd64, option GOLD (LTO support) is implied by default.
> > If it causes LTO to be enabled for the llvm13 build itself, and lld acts
> > as the linker as it did for gcc11, a small TMPDIR would overflow.
> > 
> > gcc11 required at minimum 5GiB of free space on TMPDIR.
> > (4GiB was insufficient.)
> > 
> > [1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=264949
> > 
> 
> Nuno had a specific problem others have not reported:
> 
> QUOTE
> I've tested it but it still fails:
> ---
> pid 64502 (c++), jid 7, uid 65534, was killed: failed to reclaim memory
> swap_pager: out of swap space
> ---
> on a Lenovo Legion 5, 16GB RAM and 4GB swap.
> END QUOTE
> 
> "was killed: failed to reclaim memory" and "swap_pager: out of swap
> space" are not about "small TMPDIR overflow", at least not directly.
> 
> But if the tmpfs is of a form backed by RAM+SWAP, then the tmpfs
> use can grow to contribute to causing reclaim problems and/or out of
> swap space problems by competing for RAM+SWAP space.
> 
> Nuno reported having disabled such tmpfs use ( via USE_TMPFS=no in
> /usr/local/etc/poudriere.conf ). I do not know if that status is
> well confirmed or not. I've also no clue if WITH_DEBUG= might be
> in use. (WITH_DEBUG needs to not be in use unless huge resources are
> available.)
> 
> My amd64 tests indicate that something beyond the compiles/links
> themselves was using significant RAM+SWAP in Nuno's context. I've
> no clue what.



===
Mark Millard
marklmi at yahoo.com