RPI3 swap experiments ["was killed: out of swap space" with: "v_free_count: 5439, v_inactive_count: 1"]

Thu Aug 9 15:21:58 UTC 2018

On Wed, Aug 08, 2018 at 11:56:48PM -0700, bob prohaska wrote:
> On Wed, Aug 08, 2018 at 04:48:41PM -0400, Mark Johnston wrote:
> > On Wed, Aug 08, 2018 at 08:38:00AM -0700, bob prohaska wrote:
> > > The patched kernel ran longer than default but OOMA still halted buildworld around
> > > 13 MB. That's considerably farther than a default buildworld would have run, but less than
> > > observed when setting vm.pageout_oom_seq=120 alone. Log files are at
> > > http://www.zefox.net/~fbsd/rpi3/swaptests/r337226M/1gbsdflash_1gbusbflash/batchqueue/
> > > 
> > > Both changes are now in place and -j4 buildworld has been restarted. 
> > 
> > Looking through the gstat output, I'm seeing some pretty abysmal average
> > write latencies for da0, the flash drive.  I also realized that my
> > reference to r329882 lowering the pagedaemon sleep period was wrong -
> > things have been this way for much longer than that.  Moreover, as you
> > pointed out, bumping oom_seq to a much larger value wasn't quite
> > sufficient.
> > 
> > I'm curious as to what the worst case swap I/O latencies are in your
> > test, since the average latencies reported in your logs are high enough
> > to trigger OOM kills even with the increased oom_seq value.  When the
> > current test finishes, could you try repeating it with this patch
> > applied on top? https://people.freebsd.org/~markj/patches/slow_swap.diff
> > That is, keep the non-default oom_seq setting and modification to
> > VM_BATCHQUEUE_SIZE, and apply this patch on top.  It'll cause the kernel
> > to print messages to the console under certain conditions, so a log of
> > console output will be interesting.
> 
> The run finished with a panic; I've collected the logs and terminal output at
> http://www.zefox.net/~fbsd/rpi3/swaptests/r337226M/1gbsdflash_1gbusbflash/batchqueue/pageout120/slow_swap/
> 
> There seems to be a considerable discrepancy between the wait times reported
> by the patch and the wait times reported by gstat in the first couple of 
> occurrences. The fun begins at timestamp Wed Aug  8 21:26:03 PDT 2018 in
> swapscript.log. 

The reports of "waited for swap buffer" are especially bad: during those
periods, the laundry thread is blocked waiting for in-flight swap writes
to finish before sending any more.  Because the system is generally
quite starved for clean pages that it can reuse, it's relying on swap
I/O to clean more.  If that fails, the system eventually has no choice
but to start killing processes (where the time period corresponding to
"eventually" is determined by vm.pageout_oom_seq).
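
To make the role of oom_seq concrete, here is a toy model of the idea,
not the kernel code.  It assumes the page daemon counts consecutive
passes that fail to free enough pages and gives up once that count
reaches oom_seq.  The function name, the boolean "progress" history and
the values 12 and 120 are illustrative assumptions; only the 120 setting
appears in this thread.

```python
def passes_until_oom(progress, oom_seq):
    """Return the 1-based pass at which an OOM kill would be triggered.

    progress: iterable of booleans, one per page daemon pass; True means
              the pass reclaimed enough pages (hypothetical simplification).
    oom_seq:  number of consecutive failed passes tolerated before giving up.
    Returns None if the threshold is never reached.
    """
    stalled = 0
    for pass_no, made_progress in enumerate(progress, start=1):
        if made_progress:
            stalled = 0          # any successful pass resets the counter
            continue
        stalled += 1
        if stalled >= oom_seq:
            return pass_no       # threshold reached: start killing processes
    return None


# Five good passes, then a 50-pass stall (e.g. slow swap writes), then recovery.
history = [True] * 5 + [False] * 50 + [True] * 5

low = passes_until_oom(history, 12)     # small threshold: the stall is fatal
high = passes_until_oom(history, 120)   # larger threshold: the stall is survived
```

In this sketch a larger threshold only buys time: a stall longer than
oom_seq passes still ends in a kill, which matches the observation that
raising oom_seq alone was not quite sufficient.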

Based on these latencies, I think the system is behaving more or less as
expected from the VM's perspective.  I do think the default oom_seq value
is too low and will get that addressed in 12.0.