Re: [package - 130arm64-default][lang/gcc12-devel] Failed for gcc12-devel-12.0.1.s20220306_2 in build/runaway

From: Mark Millard <marklmi_at_yahoo.com>
Date: Sat, 26 Mar 2022 20:16:42 UTC
On 2022-Mar-26, at 12:35, Mark Millard <marklmi@yahoo.com> wrote:

> On 2022-Mar-26, at 07:26, Dimitry Andric <dim@FreeBSD.org> wrote:
> 
>> On 26 Mar 2022, at 15:16, pkg-fallout@freebsd.org <pkg-fallout@FreeBSD.org> wrote:
>>> 
>>> You are receiving this mail as a port that you maintain
>>> is failing to build on the FreeBSD package build server.
>>> Please investigate the failure and submit a PR to fix
>>> build.
>>> 
>>> Maintainer:     toolchain@FreeBSD.org
>>> Log URL:        http://ampere3.nyi.freebsd.org/data/130arm64-default/60ab72786154/logs/gcc12-devel-12.0.1.s20220306_2.log
>>> Build URL:      http://ampere3.nyi.freebsd.org/build.html?mastername=130arm64-default&build=60ab72786154
>> 
>> So there isn't any actual error message in this log, except at the end:
>> 
>> ...
>> =>> Cleaning up wrkdir
>> ===>  Cleaning for gcc12-devel-12.0.1.s20220306_2
>> Killed
>> build of lang/gcc12-devel | gcc12-devel-12.0.1.s20220306_2 ended at Sat Mar 26 14:16:58 UTC 2022
>> build time: 12:31:35
>> !!! build failure encountered !!!
>> 
>> It looks like the last command being run before "Killed" is the cc1plus
>> executable being linked with LTO, so I am assuming the build is killed
>> due to an out-of-memory condition?
>> 
>> But this is only visible to people that have access to the machine the
>> poudriere instance is running on. Can somebody with access please check?
>> 
> 
> I do not have access, but I've started a poudriere build
> of my own on a HoneyComb with 64 GiBytes of RAM and slightly
> over 246 GiBytes of swap. I've a patched top that monitors
> and reports various Maximum Observed (MaxObs????) figures,
> so hopefully it will report about how big the memory use
> gets. But the build is allowed to use all 16 cores and there
> will be no competing bulk builds using resources, so it is
> not a match to the build server context.
> 
> Note: It is a ZFS context, so MaxObsWired is normally large
> and shrinks during times when memory is needed for other
> things. So the primary memory figures would be:
> 
> MaxObsSwapUsed (if any)
> MaxObsActive
> MaxObs(Act+Lndry+SwapUsed)
> 
> 
> Side Note:
> 
> http://ampere3.nyi.freebsd.org/build.html?mastername=130arm64-default&build=60ab72786154
> 
> reports a Time of 11:48:41 but the log reports "build time: 12:31:35".
> My guess is that processing the log file to extract the type of
> error accounts for some (much?) of the difference. (Type being
> runaway_process in this case.)
> 
> 
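
For scale, the gap between the two reported durations in the Side Note
can be worked out directly. This is only a minimal arithmetic sketch;
the helper name parse_hms is made up for illustration and nothing here
reflects how poudriere computes either figure.

```python
from datetime import timedelta


def parse_hms(text):
    """Parse an H:MM:SS string into a timedelta (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# The two figures quoted in the Side Note.
log_build_time = parse_hms("12:31:35")  # "build time" from the log
page_time = parse_hms("11:48:41")       # "Time" from build.html

gap = log_build_time - page_time
print(gap)  # 0:42:54
```

So the two reports differ by a bit under 43 minutes.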

I just observed a cc1plus process take somewhat over 30 min
of CPU time before it completed and the lto1-related activity
started. MaxObs(Act+Lndry+SwapUsed) was under 5 GiBytes
(no swap use observed) before the lto1-related activity
started.

For the lto1-related activity, MaxObs(Act+Lndry+SwapUsed)
has so far reached around 12 GiBytes, still with no swap
use observed:

12079Mi MaxObsActive
12278Mi MaxObs(Act+Lndry+SwapUsed)
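
To make the relationship between these figures concrete, here is a
minimal sketch of how such "maximum observed" values could be tracked
from periodic samples. It is not the patched top's actual code: the
class and field names are hypothetical, and the three samples are
made-up values (in MiB) chosen only so that the maxima match the two
figures reported above.

```python
from dataclasses import dataclass


@dataclass
class MaxObs:
    """Running maxima over memory samples (hypothetical names, MiB units)."""
    active: int = 0
    swap_used: int = 0
    act_lndry_swap: int = 0

    def update(self, active, laundry, swap_used):
        # Each figure is the largest value seen so far for that quantity.
        self.active = max(self.active, active)
        self.swap_used = max(self.swap_used, swap_used)
        # The combined figure is the maximum of the per-sample sum,
        # which need not equal the sum of the individual maxima.
        self.act_lndry_swap = max(self.act_lndry_swap,
                                  active + laundry + swap_used)


# Made-up (active, laundry, swap_used) samples, not real measurements.
samples = [(4800, 150, 0), (12079, 120, 0), (11900, 378, 0)]

tracker = MaxObs()
for active, laundry, swap_used in samples:
    tracker.update(active, laundry, swap_used)

print(tracker.active, tracker.act_lndry_swap, tracker.swap_used)
```

In this made-up sequence the largest Active value and the largest
combined value come from different samples, which is why the combined
figure is tracked separately rather than derived from the others.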

I'll note that:

last pid: . . .;  load averages:  . . . MaxObs:  28.02,  16.88,  15.82
. . . threads:   . . . running, . . . sleeping, 77 MaxObsRunning

So, on the timescale of the first load average, the load
does not always stay within the number of hardware threads
available.
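
As a rough sketch of that comparison: the quoted message mentions
"all 16 cores", and the code below assumes one hardware thread per
core, which is an assumption rather than something stated in this
thread. The variable names are illustrative only.

```python
# Assumption: 16 cores with one hardware thread each.
HW_THREADS = 16

# MaxObs load averages as reported by the patched top above.
max_obs_load = (28.02, 16.88, 15.82)

# True where the maximum observed load exceeded the hardware threads.
exceeds = [load > HW_THREADS for load in max_obs_load]

# Ratio of maximum observed load to hardware threads, per timescale.
ratios = [load / HW_THREADS for load in max_obs_load]

for load, over, ratio in zip(max_obs_load, exceeds, ratios):
    print(f"{load:6.2f}  exceeds={over}  ratio={ratio:.2f}")
```

Under that assumption, the first two MaxObs load averages were above
the hardware thread count and the third was just below it.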

No single process with sustained CPU activity persists across
the lto1 activity, so I will not be able to observe much about
CPU time.

The lto1 activity has been running for a while, but I'm
unlikely to be able to observe when it ends, so I will likely
not have a good figure for its elapsed time.


===
Mark Millard
marklmi at yahoo.com