Problems building rust with poudriere

Sun Nov 4 22:30:55 UTC 2018

Christian Stærk xi at borderworlds.dk wrote on
Sun Nov 4 13:45:01 UTC 2018 :

> For some time, I've had problems building rust with poudriere.
> 
> Poudriere log: 
> https://borderworlds.dk/~xi/rust-1.30.0.log.txt
> 
> 
> It looks like the system is running out of swap as I get this in 
> /var/log/messages:
> 
> Oct 30 05:15:31 xindi kernel: pid 68935 (rustc), uid 65534, was killed: 
> out of swap space

Unfortunately, the wording of this message is a misnomer for what
drives the kills: they are actually driven by the system being unable
to gain more free memory. FreeBSD will not swap out processes that
stay runnable (or are running), only ones that are waiting. Even a
single process that stays runnable and keeps lots of RAM in the
active category can lead to kills when swap is unused or little used.
So the kill behavior is very workload dependent.

Real "out of swap" conditions tend to also have messages
similar to:

Aug  5 17:54:01 sentinel kernel: swap_pager_getswapspace(32): failed

If you are not seeing such messages, then it is likely that
the amount of swap space still free is not the actual thing
driving the kills.

(The system simply leaves the dirty pages in memory when a
swap_pager_getswapspace failed message is produced. Of itself,
it does not cause a kill.)

Other questions:

Are you getting any I/O error reports for the device used for
swapping and paging?

Are you seeing any reports of: swap_pager: indefinite wait buffer ?

Poor I/O performance for paging and/or swapping can contribute
to the kills happening. But I've no clue if such is an issue
for your context.
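
The checks above can be sketched as a small script that counts the
relevant kernel messages in a log file. The function name
scan_swap_messages is made up for illustration, and the default
path /var/log/messages is an assumption taken from the quoted
report; I/O error reports are not covered because their wording
depends on the device driver.

```shell
# Hypothetical helper: count the kernel messages discussed above.
# $1 is the log file to scan (defaults to /var/log/messages).
scan_swap_messages() {
    log=${1:-/var/log/messages}
    for pat in 'was killed: out of swap space' \
               'swap_pager_getswapspace' \
               'swap_pager: indefinite wait buffer'; do
        # grep -c prints the number of matching lines, 0 if none;
        # -F treats the pattern as a fixed string, not a regex
        printf '%s: %s\n' "$pat" "$(grep -c -F -- "$pat" "$log")"
    done
}
```

Kills reported alongside a zero count for swap_pager_getswapspace
would fit the "swap was not really exhausted" reading.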

> The build host had 6GB og RAM and 14GB of swap. I think that ought to be 
> enough for building mostly anything.

Again, without extra evidence, do not believe the "out of swap space"
part of "killed: out of swap space".

> Has anyone else observed this behaviour?

Attempting -j4 buildworld on 1 GiByte single-board computers, for
environments that build clang/llvm, runs into lots of problems with
this sort of kill, even with plenty of swap space. There was a whole,
long exchange on the arm list during the discovery of the misnomer
status.

But it turns out there is a tunable setting that controls how many
tries at freeing memory are made before kills happen, and so how
long it takes before the kills start.

The default vm.pageout_oom_seq=12 can be increased
to increase how long a low-free-RAM condition is tolerated.
I assign vm.pageout_oom_seq in /etc/sysctl.conf, but that may
not be the best for your context.

vm.pageout_oom_seq=120 has proved useful. In some extreme
situations (buildworld buildkernel in a low RAM, slow
context, including long I/O latencies) vm.pageout_oom_seq=1024
or more has been used to avoid kills when there was plenty
of swap space.

So you may want to try assigning vm.pageout_oom_seq .
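
A minimal sketch of making the assignment persistent follows. The
helper name set_sysctl_conf is made up for illustration, and the
value 120 is just the one mentioned above, not a recommendation
for every context.

```shell
# Hypothetical helper: put NAME=VALUE into a sysctl.conf-style file,
# replacing any existing assignment of NAME instead of duplicating it.
set_sysctl_conf() {
    conf=$1; name=$2; value=$3
    tmp=$(mktemp) || return 1
    # Keep every existing line that does not begin with "NAME="
    if [ -f "$conf" ]; then
        awk -v n="$name=" 'index($0, n) != 1' "$conf" > "$tmp"
    fi
    printf '%s\n' "$name=$value" >> "$tmp"
    # Overwrite in place so the file keeps its owner and mode
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

On FreeBSD, as root, the two steps would then be:

    sysctl vm.pageout_oom_seq=120
    set_sysctl_conf /etc/sysctl.conf vm.pageout_oom_seq 120

The first changes the running system; the second makes the value
apply again at the next boot.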

Side note:

Quoting Trev's reply (and the original question)
from a list exchange:

QUOTE
What does the error swap_pager: indefinite wait buffer: mean?

This means that a process is trying to page memory to disk, and the page attempt has hung trying to access the disk for more than 20 seconds. It might be caused by bad blocks on the disk drive, disk wiring, cables, or any other disk I/O-related hardware. If the drive itself is bad, disk errors will appear in /var/log/messages and in the output of dmesg. Otherwise, check the cables and connections.
ENDQUOTE

It is possible for some systems to queue up more than the I/O
system can process in 20 seconds, even when the I/O is working
well (but is relatively slow compared to the workload).

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)