Re: Resolved: devel/llvm13 build: "ninja: build stopped: subcommand failed"

From: Nuno Teixeira <eduardo_at_freebsd.org>
Date: Sun, 14 Aug 2022 17:15:28 UTC
I use ZFS.

I will follow your recommendations, use 64 GB of swap, and then test it
again.

In the meantime I will take a look at the FreeBSD docs to see how to
increase swap, either by adding a new swap file or by resizing the current
one if possible.
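For reference, on ZFS one common approach (per the FreeBSD Handbook) is a
dedicated swap zvol. A sketch, assuming a pool named "zroot" (the pool name
and the 64G size are placeholders for your setup):

```shell
# Create a 64 GB zvol tuned for swap use (pool name "zroot" is an assumption);
# org.freebsd:swap=on lets rc.d/zvol activate it at boot:
zfs create -V 64G -o org.freebsd:swap=on \
    -o checksum=off -o compression=off -o dedup=off \
    -o sync=always -o primarycache=none zroot/swap
# Enable it immediately:
swapon /dev/zvol/zroot/swap
```

Note that swap on a zvol can behave poorly under heavy memory pressure, so a
dedicated swap partition is generally preferred when repartitioning is an
option.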

Mark Millard <marklmi@yahoo.com> wrote on Sunday, 14/08/2022 at
17:35:

> On 2022-Aug-14, at 07:50, Nuno Teixeira <eduardo@freebsd.org> wrote:
>
> Hello Mark,
>
> > I use poudriere with USE_TMPFS=no (because of low memory).
> > The problem "ninja: build stopped: subcommand failed"
>
> That is never the original error; it is just ninja reporting that
> it observed an error, generally in one of the other processes
> involved. A wide variety of errors end up with a
> "ninja: build stopped: subcommand failed" notice.
>
> The original error should be earlier in the log or on the
> console ( or in /var/log/messages ). The "was killed: failed
> to reclaim memory" is an example.
>
> With 16 GiBytes of RAM you could have up to something like
> 60 GiByte of swap without FreeBSD complaining about being
> potentially mistuned. (It would complain before 64 GiBytes
> of SWAP.) 16+60 would be 76 GiBytes for RAM+SWAP.
>
> I forgot to ask about UFS vs. ZFS being in use: which is in
> use? (ZFS uses more RAM.)
>
> > have some time now. It is caused by a peak of memory use during the
> build that affects people with less than 32/64 GB of memory. To work around
> it, the port must be built using one builder with one core, which takes
> about 7 hours on my machine, or with 6c+6t in the 12.3 i386 jail, which
> takes about 45 min (123i386 is the only jail in which I can use all cores).
>
> Last I tried, I built all the various devel/llvm* on an 8 GiByte
> RPi4B, with 4 builders active and ALLOW_MAKE_JOBS=yes in use.
> 4 FreeBSD cpus. So the load average would have been around 16+
> much of the time during devel/llvm13's builder activity.
> USE_TMPFS=data was in use.
>
> Similarly for a 16 GiByte machine, but it is also an aarch64
> context, also with 4 FreeBSD cpus.
>
> But I use in /boot/loader.conf:
>
> #
> # Delay when persistent low free RAM leads to
> # Out Of Memory killing of processes:
> vm.pageout_oom_seq=120
>
> This has historically been important for avoiding the likes of
> "was killed: failed to reclaim memory" and related notices on
> various armv7 and aarch64 small-board computers used to
> buildworld, buildkernel, and build ports using all the cores.
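The same knob can be inspected and raised at runtime with sysctl; the
/boot/loader.conf line above only makes the setting persist across reboots:

```shell
# Show the current value (the FreeBSD default is 12):
sysctl vm.pageout_oom_seq
# Raise it without a reboot, so the OOM killer waits longer
# before giving up on reclaiming memory:
sysctl vm.pageout_oom_seq=120
```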
>
> The only amd64 system that I've access to has 32 FreeBSD cpus
> and 128 GiBytes of RAM. Not a good basis for a comparison test
> with your context. I've no i386 access at all.
>
> > llvm 12 build without problems
>
> Hmm. I'll try building devel/llvm13 on aarch64 with periodic
> sampling of the memory use to see maximum observed figures
> for SWAP and for various categories of RAM, as well as the
> largest observed load averages.
>
> A ZFS context is in use. I could try UFS as well.
>
> Swap: 30720Mi Total on the 8GiByte RPi4B.
> So about 38 GiBytes RAM+SWAP available.
> We should see how much SWAP is used.
>
> Before starting poudriere, shortly after a reboot:
>
> 19296Ki MaxObs(Act+Lndry+SwapUsed)
> (No SWAP in use at the time.)
>
> # poudriere bulk -jmain-CA72-bulk_a -w devel/llvm13
>
> For the from-scratch build, it reports:
>
> [00:00:34] Building 91 packages using up to 4 builders
>
> The ports tree is about a month back:
>
> # ~/fbsd-based-on-what-commit.sh -C /usr/ports/
> branch: main
> merge-base: 872199326a916efbb4bf13c97bc1af910ba1482e
> merge-base: CommitDate: 2022-07-14 01:26:04 +0000
> 872199326a91 (HEAD -> main, freebsd/main, freebsd/HEAD) devel/ruby-build:
> Update to 20220713
> n589512 (--first-parent --count for merge-base)
>
> But, if I gather right, the problem you see goes back
> before that.
>
> I cannot tell how 4 FreeBSD cpus compare to the
> count that the Lenovo Legion 5 gets.
>
> I'll report on its maximum observed figures once the
> build stops. It will be a while before the RPi4B
> gets that far.
>
> The ports built prior to devel/llvm13's builder starting
> will lead to load averages over 4 from up to 4
> builders, each potentially using up to around 4
> processes. I'll see about starting separate tracking
> once devel/llvm13's builder has started, if I happen
> to observe it in the right time frame.
>
> > Cheers
> >
> > Mark Millard <marklmi@yahoo.com> wrote on Sunday, 14/08/2022
> at 03:54:
> > Nuno Teixeira <eduardo_at_freebsd.org> wrote on
> > Date: Sat, 13 Aug 2022 16:52:09 UTC :
> >
> > > . . .
> > > I've tested it but it still fails:
> > > ---
> > > pid 64502 (c++), jid 7, uid 65534, was killed: failed to reclaim memory
> > > swap_pager: out of swap space
> > > ---
> > > on a Lenovo Legion 5, 16GB RAM and 4GB swap.
> > > . . .
> >
> > This leaves various points unclear:
> >
> > poudriere style build? Some other style?
> >
> > (I'll state questions in a form generally for a poudriere style
> > context. Some could be converted to analogous points for other
> > build-styles.)
> >
> > How many poudriere builders allowed (-JN) ?
> >
> > /usr/local/etc/poudriere.conf :
> > ALLOW_MAKE_JOBS=yes in use?
> > ALLOW_MAKE_JOBS_PACKAGES=??? in use?
> > USE_TMPFS=??? With what value? Anything other than "data" or "no"?
> >
> > /usr/local/etc/poudriere.d/make.conf (or the like):
> > MAKE_JOBS_NUMBER=??? in use? With what value?
> >
> > Is tmpfs in use such that it will use RAM+SWAP when the
> > used tmpfs space is large?
> >
> > How much free space is available for /tmp ?
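For readers following along, the settings asked about above live roughly as
follows. A sketch only, with illustrative placeholder values, not a
recommendation:

```shell
# /usr/local/etc/poudriere.conf (excerpt; values are illustrative)
USE_TMPFS=no            # or "data"; "yes"/"all" back builds with RAM-based tmpfs
ALLOW_MAKE_JOBS=yes     # let each builder run parallel make jobs
PARALLEL_JOBS=2         # number of builders; poudriere bulk -JN overrides this

# /usr/local/etc/poudriere.d/make.conf (excerpt)
MAKE_JOBS_NUMBER=2      # cap make -j within each builder
```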
> >
> > Are you using something like ( in, say, /boot/loader/conf ):
>
> That should have been: /boot/loader.conf
>
> Sorry.
>
> > #
> > # Delay when persistent low free RAM leads to
> > # Out Of Memory killing of processes:
> > vm.pageout_oom_seq=120
> >
> >
> > How many FreeBSD cpus does a Lenovo Legion 5 present
> > in the configuration used?
> >
>
>
> ===
> Mark Millard
> marklmi at yahoo.com
>
>

-- 
Nuno Teixeira
FreeBSD Committer (ports)