Belated out of swap kill on rpi3 at r359216

Mark Millard marklmi at yahoo.com
Tue Mar 24 17:09:58 UTC 2020



On 2020-Mar-24, at 08:57, bob prohaska <fbsd at www.zefox.net> wrote:

> An attempt to buildworld on an rpi3 running r359216 stopped with an
> OOMA kill. Sources were at 359264, loader.conf contained
> vm.pageout_oom_seq="4096" . 

What was the value of vm.pfault_oom_attempts?

If vm.pfault_oom_attempts was not -1,
what was the value of vm.pfault_oom_wait?
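(For reference, both values are adjustable via /etc/sysctl.conf, and
vm.pageout_oom_seq can also go in loader.conf. A -1 for
vm.pfault_oom_attempts avoids OOM kills driven by long page-fault
waits on slow media. Below is an example of the sort of settings to
experiment with — a suggestion, not established recommended values:)

```
# /etc/sysctl.conf (example settings to experiment with)
vm.pageout_oom_seq=4096
# -1: do not start OOM kills based on repeated
# page-fault allocation waits
vm.pfault_oom_attempts=-1
```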

> A snippet of gstat log suggests the worst congestion in the storage I/O
> happened at Tue Mar 24 04:52:26 PDT 2020 with an L(q) of 37, but the
> OOMA kill happened at Tue Mar 24 04:53:04 PDT 2020, by which time the
> L(q) had dropped to one, half a minute later.
> 
> Is the delay in OOMA action to be expected? 
> 
> Here's the relevant part of the log, I hope the columns display readably:
> 
> 0/2/2/19177 mbuf clusters in use (current/cache/total/max)
> procs     memory       page                      disks     faults       cpu
> r b w     avm     fre  flt  re  pi  po    fr   sr mm0 da0   in   sy   cs us sy id
> 0  0 29 1897320   47312 12851   9   4   5 13330 1669   0   0 14172  8020  3034 65 29  6
> dT: 1.056s  w: 1.000s
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>   37    367    323   5463    6.4     45   1511   76.8      0      0    0.0   91.7  mmcsd0
>   37    367    323   5463    6.5     45   1511   76.9      0      0    0.0   93.3  mmcsd0s2
>   34    133    111   3209    7.6     22    697  134.7      0      0    0.0   74.0  mmcsd0s2a
>    3    235    212   2254    5.9     23    814   21.5      0      0    0.0   70.0  mmcsd0s2b
>   34    133    111   3209    7.6     22    697  134.7      0      0    0.0   74.2  ufs/rootfs
> Tue Mar 24 04:52:26 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   224484  4179768     5%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096

How bad were things back when the "indefinite wait buffer" notices
were generated (the Mar 24 04:20:50 time frame)?

> 0/0/0/19177 mbuf clusters in use (current/cache/total/max)
> procs     memory       page                      disks     faults       cpu
> r b w     avm     fre  flt  re  pi  po    fr   sr mm0 da0   in   sy   cs us sy id
> 3  0 29 1936132   19232 12849   9   4   5 13328 1671   0   0 14173  8020  3036 65 29  6
> dT: 1.010s  w: 1.000s
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>    9    323    148   2531   20.7    175   3327   14.6      0      0    0.0   97.2  mmcsd0
>    9    323    148   2531   20.8    175   3327   14.6      0      0    0.0   98.4  mmcsd0s2
>    2     96     96   1869   16.9      0      0    0.0      0      0    0.0   82.9  mmcsd0s2a
>    7    227     51    661   28.3    175   3327   14.6      0      0    0.0   84.0  mmcsd0s2b
>    2     96     96   1869   16.9      0      0    0.0      0      0    0.0   83.0  ufs/rootfs
> Tue Mar 24 04:52:41 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   272928  4131324     6%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 0/0/0/19177 mbuf clusters in use (current/cache/total/max)
> procs     memory       page                      disks     faults       cpu
> r b w     avm     fre  flt  re  pi  po    fr   sr mm0 da0   in   sy   cs us sy id
> 0  0 29 1939940   48016 12849   9   4   5 13328 1676 948   0 14174  8019  3037 65 29  6
> dT: 1.001s  w: 1.000s
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>    3    371    151   1989   91.0    221   3847   14.0      0      0    0.0  100.1  mmcsd0
>    3    371    151   1989   91.1    221   3847   14.0      0      0    0.0  100.1  mmcsd0s2
>    1     41     41    415   74.5      0      0    0.0      0      0    0.0  304.8  mmcsd0s2a
>    2    331    110   1574   97.3    221   3847   14.0      0      0    0.0   99.3  mmcsd0s2b
>    1     41     41    415   74.5      0      0    0.0      0      0    0.0  304.9  ufs/rootfs
> Tue Mar 24 04:52:45 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   282480  4121772     6%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 0/2/2/19177 mbuf clusters in use (current/cache/total/max)
> procs     memory       page                      disks     faults       cpu
> r b w     avm     fre  flt  re  pi  po    fr   sr mm0 da0   in   sy   cs us sy id
> 2  0 29 1944652   42768 12849   9   4   5 13328 1678   0   0 14174  8019  3037 65 29  6
> dT: 1.010s  w: 1.000s
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>    2    498    498   6345    3.5      0      0    0.0      0      0    0.0   88.8  mmcsd0
>    2    498    498   6345    3.9      0      0    0.0      0      0    0.0   98.3  mmcsd0s2
>    1    152    152   1501    4.1      0      0    0.0      0      0    0.0   62.1  mmcsd0s2a
>    1    346    346   4844    3.9      0      0    0.0      0      0    0.0   88.8  mmcsd0s2b
>    1    152    152   1501    4.2      0      0    0.0      0      0    0.0   62.5  ufs/rootfs
> Tue Mar 24 04:52:47 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   282396  4121856     6%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 0/0/0/19177 mbuf clusters in use (current/cache/total/max)
> procs     memory       page                      disks     faults       cpu
> r b w     avm     fre  flt  re  pi  po    fr   sr mm0 da0   in   sy   cs us sy id
> 5  0 28 1980200   19676 12848   9   4   5 13327 1684 714   0 14175  8019  3039 65 29  6
> dT: 1.002s  w: 1.000s
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>    9    487    174   1449   26.2    313   2427    9.1      0      0    0.0   98.4  mmcsd0
>    9    487    174   1449   26.4    313   2427    9.1      0      0    0.0  100.0  mmcsd0s2
>    1     45     45    267   22.9      0      0    0.0      0      0    0.0   80.3  mmcsd0s2a
>    9    442    129   1182   27.7    313   2427    9.1      0      0    0.0   97.2  mmcsd0s2b
>    1     45     45    267   22.9      0      0    0.0      0      0    0.0   80.4  ufs/rootfs
> Tue Mar 24 04:53:01 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   309996  4094256     7%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 0/2/2/19177 mbuf clusters in use (current/cache/total/max)
> procs     memory       page                      disks     faults       cpu
> r b w     avm     fre  flt  re  pi  po    fr   sr mm0 da0   in   sy   cs us sy id
> 0  0 29 1983636   44400 12848   9   4   5 13326 1689   0   0 14175  8018  3040 65 29  6
> dT: 1.002s  w: 1.000s
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>   12    528      0      0    0.0    528   3797    7.5      0      0    0.0  100.0  mmcsd0
>   12    528      0      0    0.0    528   3797    7.5      0      0    0.0  100.0  mmcsd0s2
>   11    528      0      0    0.0    528   3797    7.5      0      0    0.0  100.0  mmcsd0s2b
> Tue Mar 24 04:53:05 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   322360  4081892     7%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 3/769/772/19177 mbuf clusters in use (current/cache/total/max)
> procs     memory       page                      disks     faults       cpu
> r b w     avm     fre  flt  re  pi  po    fr   sr mm0 da0   in   sy   cs us sy id
> 1  0 29 1826272  165204 12848   9   4   5 13326 1689 971   0 14176  8018  3040 65 29  6
> dT: 1.001s  w: 1.000s
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>    1    614    604   4661    4.9     10    168   40.6      0      0    0.0   93.8  mmcsd0
>    1    614    604   4661    4.9     10    168   40.6      0      0    0.0   94.5  mmcsd0s2
>    0    210    209   1243    4.1      1     32  181.0      0      0    0.0   83.1  mmcsd0s2a
>    1    404    395   3418    5.4      9    136   25.1      0      0    0.0   81.0  mmcsd0s2b
>    0    210    209   1243    4.1      1     32  181.0      0      0    0.0   83.6  ufs/rootfs
> Tue Mar 24 04:53:08 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   230064  4174188     5%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> Mar 24 04:53:04 www kernel: pid 96708 (c++), jid 0, uid 0, was killed: out of swap space
> 0/772/772/19177 mbuf clusters in use (current/cache/total/max)
> procs     memory       page                      disks     faults       cpu
> r b w     avm     fre  flt  re  pi  po    fr   sr mm0 da0   in   sy   cs us sy id
> 2  0 28 1705884  221144 12848   9   4   5 13326 1689 1109   0 14176  8018  3041 65 29  6
> dT: 1.002s  w: 1.000s
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>    0    545    545   3580    2.1      0      0    0.0      0      0    0.0   74.0  mmcsd0
>    0    545    545   3580    2.1      0      0    0.0      0      0    0.0   75.7  mmcsd0s2
>    0    288    288   1517    2.0      0      0    0.0      0      0    0.0   56.4  mmcsd0s2a
>    0    256    256   2064    2.4      0      0    0.0      0      0    0.0   49.0  mmcsd0s2b
>    0    288    288   1517    2.0      0      0    0.0      0      0    0.0   57.1  ufs/rootfs
> Tue Mar 24 04:53:10 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   230064  4174188     5%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> Mar 24 04:53:04 www kernel: pid 96708 (c++), jid 0, uid 0, was killed: out of swap space
> 

Below I show code changes that make the output messaging more
explicit about what contributed to initiating OOM kills, without
needing a verbose boot or the like. There are also some messages
from Mark J.'s old code for reporting on conditions related to
initiating OOM kills, or minor variations of them.

You may want to try such changes to get more context for your
OOM kills when they happen. The 4 reasons reported are shown
below (using the most detailed of the related messages from the
different stages):

swp_pager_meta_build: swap blk uma zone exhausted

swp_pager_meta_build: swap pctrie uma zone exhausted

vm_fault_allocate: proc %d (%s) failed to alloc page on fault, starting OOM

vm_pageout_mightbe_oom: kill context: v_free_count: %u, v_inactive_count: %u


The diffs below are based on my head -r358966 context.

# svnlite diff /usr/src/sys/vm/ | more
Index: /usr/src/sys/vm/swap_pager.c
===================================================================
--- /usr/src/sys/vm/swap_pager.c        (revision 358966)
+++ /usr/src/sys/vm/swap_pager.c        (working copy)
@@ -1998,6 +1998,7 @@
                                    0, 1))
                                        printf("swap blk zone exhausted, "
                                            "increase kern.maxswzone\n");
+                               printf("swp_pager_meta_build: swap blk uma zone exhausted\n");
                                vm_pageout_oom(VM_OOM_SWAPZ);
                                pause("swzonxb", 10);
                        } else
@@ -2028,6 +2029,7 @@
                                    0, 1))
                                        printf("swap pctrie zone exhausted, "
                                            "increase kern.maxswzone\n");
+                               printf("swp_pager_meta_build: swap pctrie uma zone exhausted\n");
                                vm_pageout_oom(VM_OOM_SWAPZ);
                                pause("swzonxp", 10);
                        } else
Index: /usr/src/sys/vm/vm_fault.c
===================================================================
--- /usr/src/sys/vm/vm_fault.c  (revision 358966)
+++ /usr/src/sys/vm/vm_fault.c  (working copy)
@@ -1074,9 +1074,8 @@
                        fs->oom++;
                        vm_waitpfault(dset, vm_pfault_oom_wait * hz);
                } else  {
-                       if (bootverbose)
-                               printf(
-               "proc %d (%s) failed to alloc page on fault, starting OOM\n",
+                       printf(
+               "vm_fault_allocate: proc %d (%s) failed to alloc page on fault, starting OOM\n",
                                    curproc->p_pid, curproc->p_comm);
                        vm_pageout_oom(VM_OOM_MEM_PF);
                        fs->oom = 0;
Index: /usr/src/sys/vm/vm_page.c
===================================================================
--- /usr/src/sys/vm/vm_page.c   (revision 358966)
+++ /usr/src/sys/vm/vm_page.c   (working copy)
@@ -3134,6 +3134,7 @@
         * race-free vm_wait_domain().
         */
        if (curproc == pageproc) {
+               printf("thread %d waiting for memory\n", curthread->td_tid);
                mtx_lock(&vm_domainset_lock);
                vm_pageproc_waiters++;
                msleep(&vm_pageproc_waiters, &vm_domainset_lock, PVM | PDROP,
Index: /usr/src/sys/vm/vm_pageout.c
===================================================================
--- /usr/src/sys/vm/vm_pageout.c        (revision 358966)
+++ /usr/src/sys/vm/vm_pageout.c        (working copy)
@@ -1741,6 +1741,8 @@
         * start OOM.  Initiate the selection and signaling of the
         * victim.
         */
+       printf("vm_pageout_mightbe_oom: kill context: v_free_count: %u, v_inactive_count: %u\n",
+           vmd->vmd_free_count, vmd->vmd_pagequeues[PQ_INACTIVE].pq_cnt);
        vm_pageout_oom(VM_OOM_MEM);
 
        /*
@@ -1933,10 +1935,24 @@
        }
        sx_sunlock(&allproc_lock);
        if (bigproc != NULL) {
+               char *reason_text;
+
+               switch (shortage) {
+               case VM_OOM_MEM_PF:
+                       reason_text = "fault's page allocation failed";
+                       break;
+               case VM_OOM_MEM:
+                       reason_text = "free RAM stayed below threshold";
+                       break;
+               case VM_OOM_SWAPZ:
+                       reason_text = "swblk or swpctrie zone exhausted";
+                       break;
+               default:
+                       reason_text = "unknown reason";
+               }
                if (vm_panic_on_oom != 0 && --vm_panic_on_oom == 0)
-                       panic("out of swap space");
+                       panic("%s", reason_text);
                PROC_LOCK(bigproc);
-               killproc(bigproc, "out of swap space");
+               killproc(bigproc, reason_text);
                sched_nice(bigproc, PRIO_MIN);
                _PRELE(bigproc);
                PROC_UNLOCK(bigproc);



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



More information about the freebsd-arm mailing list