building LLVM threads gets killed

Mark Millard marklmi at yahoo.com
Mon Aug 20 14:58:22 UTC 2018


Rodney W. Grimes freebsd-rwg at pdx.rh.CN85.dnsmgr.net wrote on
Mon Aug 20 14:26:55 UTC 2018 :

> > On 20 Aug 2018, at 05:01, blubee blubeeme <gurenchan at gmail.com> wrote:
> > > 
> > > I am running current compiling LLVM60 and when it comes to linking
> > > basically all the processes on my computer gets killed; Chrome, Firefox and
> > > some of the LLVM threads as well
> > ...
> > > llvm/build % ninja -j8
> > > [2408/2473] Building CXX object
> > > lib/Passes/CMakeFiles/LLVMPasses.dir/PassBuilder.cpp.o
> > > FAILED: lib/Passes/CMakeFiles/LLVMPasses.dir/PassBuilder.cpp.o
> > > /usr/bin/c++  -DGTEST_HAS_RTTI=0 -D_DEBUG -D__STDC_CONSTANT_MACROS
> > > -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/Passes -I../lib/Passes
> > > -Iinclude -I../include -isystem /usr/local/include -fPIC
> > > -fvisibility-inlines-hidden -Werror=date-time
> > > -Werror=unguarded-availability-new -std=c++11 -Wall -Wextra
> > > -Wno-unused-parameter -Wwrite-strings -Wcast-qual
> > > -Wmissing-field-initializers -pedantic -Wno-long-long
> > > -Wcovered-switch-default -Wnon-virtual-dtor -Wdelete-non-virtual-dtor
> > > -Wstring-conversion -fdiagnostics-color -g    -fno-exceptions -fno-rtti -MD
> > > -MT lib/Passes/CMakeFiles/LLVMPasses.dir/PassBuilder.cpp.o -MF
> > > lib/Passes/CMakeFiles/LLVMPasses.dir/PassBuilder.cpp.o.d -o
> > > lib/Passes/CMakeFiles/LLVMPasses.dir/PassBuilder.cpp.o -c
> > > ../lib/Passes/PassBuilder.cpp
> > > c++: error: unable to execute command: Killed
> > 
> > It is running out of RAM while running multiple parallel link jobs.  If
> > you are building using WITH_DEBUG, turn that off, it consumes large
> > amounts of memory.  If you must have debug info, try adding the
> > following flag to the CMake command line:
> > 
> > -D LLVM_PARALLEL_LINK_JOBS:STRING="1"
> > 
> > That will limit the amount of parallel link jobs to 1, even if you
> > specify -j 8 to gmake or ninja.
> > 
> > Brooks, it would not be a bad idea to always use this CMake flag in the
> > llvm ports. :)
> 
> And this may also fix the issues that all the small
> memory (aka, RPI*) builders are facing when trying
> to do -j4?

It may help, but:

Even compiles alone, with no links running, can get the kills
in such small-memory contexts.
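
(For reference, the flag mentioned above goes on the cmake
configure step, roughly like the following sketch; the source
path and -j figure here are placeholders rather than anything
from this thread:

# cmake -G Ninja -D LLVM_PARALLEL_LINK_JOBS:STRING="1" /path/to/llvm
# ninja -j4

Limiting the link jobs bounds the peak memory use from the
linkers but, as noted, the compiles by themselves can still
drive the kills.)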

And going for a simpler context that can demonstrate
the behavior . . .

Taking a Pine64+ 2GB as an example (4 cores with 1
HW-thread per core, 2 GiBytes of RAM, USB device for
root file system and swap partition):

In another login:
# stress -d 2 -m 4 --vm-keep --vm-bytes 536870912

That "4" and "536870912" total to the 2 GiBytes so
swapping is induced for the context in question.
(Scale --vm-bytes appropriately to context.)
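
(For example, scaling the same formula to a 1 GiByte board,
which I did not run here:

# stress -d 2 -m 4 --vm-keep --vm-bytes 268435456

since 4 * 268435456 totals the 1 GiByte.)
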
[Note: I had 3 GiBytes of swap space in a partition
for the below.]

[stress is from the port sysutils/stress .]

I had left the default vm.pageout_oom_seq=12 in place for this,
which makes the kills easier to get than the 120 figure would.
Generally it does not take very long for some sort of message
to show up. Sometimes kills happen:

My test environment has Mark Johnston's patches to report
things not normally reported:

waited 9s for async swap write
waited 9s for swap buffer
waited 9s for async swap write
waited 9s for async swap write
waited 9s for async swap write
v_free_count: 1357, v_inactive_count: 1
Aug 20 06:04:27 pine64 kernel: pid 1010 (stress), uid 0, was killed: out of swap space
waited 5s for async swap write
waited 5s for swap buffer
waited 5s for async swap write
waited 5s for async swap write
waited 5s for async swap write
waited 13s for async swap write
waited 12s for swap buffer
waited 13s for async swap write
waited 12s for async swap write
waited 12s for async swap write
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161177, size: 131072
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164766, size: 65536
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164064, size: 12288
. . .

(I made multiple runs, most manually stopped. None ran for
all that long.)

(Warning: the swap space part of "killed: out of swap space"
can be a misnomer. Killing is driven by having low free RAM
for sufficiently long. vm.pageout_oom_seq controls how
long. Swap space may be unused, little used, or actually be
low. With 3 GiBytes of swap space in the partition, these
runs were not low on swap space.)
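
(For those wanting to make the kills harder to get, the figure
can be raised via the normal sysctl mechanism, for example:

# sysctl vm.pageout_oom_seq=120

or the equivalent line in /etc/sysctl.conf. The 120 is just the
figure I happen to use, not a tuned recommendation.)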

Even with fairly stable ms/w figures the queue depths can
become large, implying long times before the last queued
write completes. For example (for a USB storage device holding
the root file system and swap partition):

dT: 1.006s  w: 1.000s
L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
   0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| mmcsd0
  56    312      0      0    0.0    312  19985  142.6      0      0    0.0   99.6| da0

Large quantities of bytes are queued in a short time
relative to the roughly 20 MiByte/sec that the device can
sustain, with the total spread over the queue entries.
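
(Figures like the above come from gstat in the base system;
something like

# gstat -pd

shows this sort of display, including the delete-operation
columns. I note the exact flags as an assumption, not as what
I had used for the capture above.)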


Note: The Pine64+ 2GB builds devel/llvm60 in 14.5 hr just fine
with vm.pageout_oom_seq=120 when using all 4 cores. But it
has twice the RAM of an rpi3/rpi2 while having the same number
of cores. It simply does not have as much to write out as
often. (The same sort of point goes for buildworld buildkernel
sorts of activities.)


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)


