RPI3 swap experiments ["was killed: out of swap space" with: "v_free_count: 5439, v_inactive_count: 1"]
Mark Millard
marklmi at yahoo.com
Thu Aug 9 16:44:33 UTC 2018
On 2018-Aug-9, at 8:21 AM, Mark Johnston <markj at FreeBSD.org> wrote:
> On Wed, Aug 08, 2018 at 11:56:48PM -0700, bob prohaska wrote:
>> On Wed, Aug 08, 2018 at 04:48:41PM -0400, Mark Johnston wrote:
>>> On Wed, Aug 08, 2018 at 08:38:00AM -0700, bob prohaska wrote:
>>>> The patched kernel ran longer than the default, but OOMA still halted buildworld around
>>>> 13 MB. That's considerably farther than a default buildworld would have run, but less than
>>>> observed when setting vm.pageout_oom_seq=120 alone. Log files are at
>>>> http://www.zefox.net/~fbsd/rpi3/swaptests/r337226M/1gbsdflash_1gbusbflash/batchqueue/
>>>>
>>>> Both changes are now in place and -j4 buildworld has been restarted.
>>>
>>> Looking through the gstat output, I'm seeing some pretty abysmal average
>>> write latencies for da0, the flash drive. I also realized that my
>>> reference to r329882 lowering the pagedaemon sleep period was wrong -
>>> things have been this way for much longer than that. Moreover, as you
>>> pointed out, bumping oom_seq to a much larger value wasn't quite
>>> sufficient.
>>>
>>> I'm curious as to what the worst case swap I/O latencies are in your
>>> test, since the average latencies reported in your logs are high enough
>>> to trigger OOM kills even with the increased oom_seq value. When the
>>> current test finishes, could you try repeating it with this patch
>>> applied on top? https://people.freebsd.org/~markj/patches/slow_swap.diff
>>> That is, keep the non-default oom_seq setting and modification to
>>> VM_BATCHQUEUE_SIZE, and apply this patch on top. It'll cause the kernel
>>> to print messages to the console under certain conditions, so a log of
>>> console output will be interesting.
>>
>> The run finished with a panic; I've collected the logs and terminal output at
>> http://www.zefox.net/~fbsd/rpi3/swaptests/r337226M/1gbsdflash_1gbusbflash/batchqueue/pageout120/slow_swap/
>>
>> There seems to be a considerable discrepancy between the wait times reported
>> by the patch and the wait times reported by gstat in the first couple of
>> occurrences. The fun begins at timestamp Wed Aug 8 21:26:03 PDT 2018 in
>> swapscript.log.
>
> The reports of "waited for swap buffer" are especially bad: during those
> periods, the laundry thread is blocked waiting for in-flight swap writes
> to finish before sending any more. Because the system is generally
> quite starved for clean pages that it can reuse, it's relying on swap
> I/O to clean more. If that fails, the system eventually has no choice
> but to start killing processes (where the time period corresponding to
> "eventually" is determined by vm.pageout_oom_seq).
Just for reference, here is some context from
http://www.zefox.net/~fbsd/rpi3/swaptests/r337226M/1gbsdflash_1gbusbflash/batchqueue/pageout120/slow_swap/swapscript.log
for two of the swap buffer messages, showing when they appeared
relative to other activity:
Wed Aug 8 22:01:14 PDT 2018
Device 1K-blocks Used Avail Capacity
/dev/da0b 1048576 23912 1024664 2%
/dev/mmcsd0s3b 1048576 23892 1024684 2%
Total 2097152 47804 2049348 2%
Aug 8 21:36:13 www syslogd: last message repeated 1 times
Aug 8 21:36:13 www kernel: waited 3s for async swap write
dT: 10.004s w: 10.000s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
2 20 1 4 25.5 19 1216 41.4 0 0 0.0 29.3 mmcsd0
3 29 1 11 10.8 28 1244 23.9 0 0 0.0 18.7 da0
0 0 0 0 0.0 0 3 42.9 0 0 0.0 0.4 mmcsd0s2
2 20 1 4 25.6 19 1213 41.5 0 0 0.0 29.4 mmcsd0s3
0 0 0 0 0.0 0 3 43.0 0 0 0.0 0.4 ufs/rootfs
2 20 1 4 25.6 19 1213 41.6 0 0 0.0 29.4 mmcsd0s3b
2 20 1 11 10.9 19 1238 20.6 0 0 0.0 18.6 da0b
Wed Aug 8 22:01:24 PDT 2018
Device 1K-blocks Used Avail Capacity
/dev/da0b 1048576 37284 1011292 4%
/dev/mmcsd0s3b 1048576 36920 1011656 4%
Total 2097152 74204 2022948 4%
Aug 8 21:36:13 www syslogd: last message repeated 1 times
Aug 8 21:36:13 www kernel: waited 3s for async swap write
dT: 10.002s w: 10.000s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
0 8 0 0 4.7 8 693 41.7 0 0 0.0 12.2 mmcsd0
13 13 0 0 0.0 13 682 132.7 0 0 0.0 78.3 da0
0 8 0 0 4.7 8 693 41.8 0 0 0.0 12.2 mmcsd0s3
0 8 0 0 4.8 8 693 41.9 0 0 0.0 12.3 mmcsd0s3b
5 7 0 0 0.0 7 676 79.5 0 0 0.0 40.1 da0b
Wed Aug 8 22:01:34 PDT 2018
Device 1K-blocks Used Avail Capacity
/dev/da0b 1048576 44680 1003896 4%
/dev/mmcsd0s3b 1048576 44132 1004444 4%
Total 2097152 88812 2008340 4%
Aug 8 21:36:13 www kernel: waited 3s for async swap write
Aug 8 22:01:30 www syslogd: last message repeated 1 times
Aug 8 22:01:41 www kernel: waited 10s for async swap write
Aug 8 22:01:41 www kernel: waited 6s for swap buffer
Aug 8 22:01:41 www kernel: waited 13s for async swap write
Aug 8 22:01:41 www kernel: waited 3s for swap buffer
Aug 8 22:01:41 www kernel: waited 12s for async swap write
Aug 8 22:01:41 www kernel: waited 10s for async swap write
Aug 8 22:01:41 www kernel: waited 3s for async swap write
dT: 10.001s w: 10.000s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
0 2 1 26 3.4 1 17 3.8 0 0 0.0 0.9 mmcsd0
0 4 3 61 3.8 2 70 5.7 0 0 0.0 1.7 da0
0 2 1 26 3.5 1 17 3.9 0 0 0.0 0.9 mmcsd0s3
0 1 0 0 0.0 1 17 4.0 0 0 0.0 0.4 mmcsd0s3a
0 1 1 26 3.5 0 0 0.0 0 0 0.0 0.5 mmcsd0s3b
0 2 2 54 3.8 0 0 0.0 0 0 0.0 0.9 da0b
0 2 0 7 4.3 1 70 5.8 0 0 0.0 0.8 da0d
Wed Aug 8 22:01:51 PDT 2018
Device 1K-blocks Used Avail Capacity
/dev/da0b 1048576 46156 1002420 4%
/dev/mmcsd0s3b 1048576 45480 1003096 4%
Total 2097152 91636 2005516 4%
Aug 8 22:01:41 www kernel: waited 10s for async swap write
Aug 8 22:01:41 www kernel: waited 3s for async swap write
dT: 10.002s w: 10.000s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
0 8 7 251 3.4 0 8 2.6 0 0 0.0 2.5 mmcsd0
0 9 7 186 4.2 2 77 7.4 0 0 0.0 3.6 da0
0 8 7 251 3.5 0 8 2.6 0 0 0.0 2.6 mmcsd0s3
0 0 0 0 0.0 0 8 2.7 0 0 0.0 0.1 mmcsd0s3a
0 7 7 251 3.5 0 0 0.0 0 0 0.0 2.5 mmcsd0s3b
0 7 7 184 4.2 0 0 0.0 0 0 0.0 2.7 da0b
0 2 0 2 4.6 2 76 7.5 0 0 0.0 0.9 da0d
> Based on these latencies, I think the system is behaving more or less as
> expected from the VM's perspective. I do think the default oom_seq value
> is too low and will get that addressed in 12.0.
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
More information about the freebsd-arm mailing list