OOM-killer doesn't work on FreeBSD 11.0

Mon Sep 11 04:04:53 UTC 2017

Hi,

I have a mail system running FreeBSD 9.3 on VMware ESXi. It is assigned a small amount of memory (1G or 2G) and a reasonably sized swap disk (2 x memory size).
The mail system ran for several years and never froze, even with a lot of mail traffic passing through it.

Recently I upgraded this mail system from FreeBSD 9.3 to FreeBSD 11.0, and after running for a few days it froze. I got no response on the console
and could not log in over SSH, although the system still answered ping. I looked into the message log and found a lot of messages like these:

swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(4): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(4): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(5): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(4): failed
swap_pager: out of swap space
swap_pager_getswapspace(1): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(12): failed
swap_pager_getswapspace(9): failed
swap_pager_getswapspace(16): failed
...

It seems that running out of swap space caused the system to freeze.

To investigate, I restored the mail system from an earlier backup snapshot, which runs FreeBSD 9.3.
I then put the mail system under mail traffic load and watched memory and swap usage with a simple shell script:

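#!/bin/sh
# Print memory and swap usage once a minute until interrupted.
while true; do
    vmstat      # memory and paging statistics
    pstat -s    # swap device usage
    sleep 60
done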

From the console, I saw memory and swap usage increase quickly. Once the swap space was used up,
out-of-swap messages appeared in the message log:

swap_pager_getswapspace(4): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(3): failed
swap_pager_getswapspace(4): failed
swap_pager_getswapspace(6): failed
swap_pager_getswapspace(2): failed
swap_pager_getswapspace(2): failed
swap_pager_getswapspace(2): failed
swap_pager_getswapspace(5): failed
swap_pager_getswapspace(8): failed
swap_pager_getswapspace(2): failed
swap_pager_getswapspace(4): failed
Sep 6 08:30:58 mail-system kernel: pid 92324 (bm_scanner), uid 5500, was killed: out of swap space
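
To summarize a log like this, a small sketch along the following lines can count the failed swap allocations and list the processes the kernel killed. It assumes the two message formats quoted above; the helper name summarize_oom_log is made up for illustration and is not an existing tool.

```shell
#!/bin/sh
# Summarize out-of-swap activity in a syslog-style file.
# Assumes the kernel message formats quoted above; summarize_oom_log is a
# hypothetical helper name.
summarize_oom_log() {
    logfile=$1
    # Count swap allocation failures (grep -c prints 0 when none match).
    failures=$(grep -c 'swap_pager_getswapspace([0-9]*): failed' "$logfile")
    echo "swap allocation failures: $failures"
    # Print pid and name of every process killed for lack of swap.
    sed -n 's/.*pid \([0-9]*\) (\([^)]*\)).*was killed: out of swap space.*/killed: \1 \2/p' "$logfile"
}

# Usage: summarize_oom_log /var/log/messages
```

On FreeBSD 9.3 the output should contain at least one "killed:" line after swap runs out; on 11.0, going by my logs, I would expect only the failure count.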

As on FreeBSD 11.0, there are many "swap_pager_getswapspace failed" messages, but FreeBSD 9.3 then kills a process to free memory.
This lets the mail system keep running on 9.3, whereas on FreeBSD 11.0 it freezes. Watching memory and swap usage continuously on 9.3,
the OOM-killer works as expected: once swap usage reaches 100%, the OOM-killer is called to kill a process and free memory.
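
The "swap is 100% used" condition can be checked from the `pstat -s` output with a sketch like this one. It assumes the usual column layout (Device, 1K-blocks, Used, Avail, Capacity) and uses the last data row, which is the Total row when several swap devices are configured; swap_is_full is a hypothetical helper name.

```shell
#!/bin/sh
# Report whether swap is exhausted, reading `pstat -s` style output on stdin.
# Assumes the columns Device, 1K-blocks, Used, Avail, Capacity;
# swap_is_full is a hypothetical helper name.
# Exit status: 0 = no swap left, 1 = swap still available, 2 = no data rows.
swap_is_full() {
    awk 'NR > 1 { used = $3; avail = $4 }   # remember the last data row
         END {
             if (NR < 2) { print "no swap devices listed"; exit 2 }
             printf "used=%dK avail=%dK\n", used, avail
             exit (avail == 0 ? 0 : 1)
         }'
}

# Usage: pstat -s | swap_is_full && echo "swap exhausted"
```

Running this inside the monitoring loop makes it easier to line up the moment swap runs out with the kernel messages in the log.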

Digging into the FreeBSD 9.3 source code, file vm_pageout.c, function vm_pageout_scan():
                /*
                * If we are critically low on one of RAM or swap and low on
                * the other, kill the largest process.  However, we avoid
                * doing this on the first pass in order to give ourselves a
                * chance to flush out dirty vnode-backed pages and to allow
                * active pages to be moved to the inactive queue and reclaimed.
                */
                if (pass != 0 &&
                    ((swap_pager_avail < 64 && vm_page_count_min()) ||
                     (swap_pager_full && vm_paging_target() > 0)))
                                vm_pageout_oom(VM_OOM_MEM);

The corresponding source code in FreeBSD 11.0, file vm_pageout.c, function vm_pageout_scan():
        /*
         * If the inactive queue scan fails repeatedly to meet its
         * target, kill the largest process.
         */
        vm_pageout_mightbe_oom(vmd, page_shortage, starting_page_shortage);

In FreeBSD 11.0, the OOM-killer function vm_pageout_oom() is no longer called directly here; it is wrapped by the function vm_pageout_mightbe_oom().

To find out which commit changed this behavior, I searched the FreeBSD SVN repository and found a clue:
https://svnweb.freebsd.org/base?view=revision&revision=290920
SVN commit r290920 added a new sysctl node, vm.pageout_oom_seq, to control the sensitivity of the OOM-killer.
The default value of pageout_oom_seq is 12. The commit log says:
The number of passes to trigger OOM was selected empirically and
tested both on small (32M-64M i386 VM) and large (32G amd64)
configurations.
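
For reference, the node can be inspected with a sketch like the one below. vm.pageout_oom_seq is the name given in r290920; the helper name show_oom_seq and the fallback message are illustrative only. Going by the commit log, a lower value should trigger the OOM-killer after fewer passes, but I have not verified that this restores the 9.3 behavior.

```shell
#!/bin/sh
# Print the current OOM sensitivity setting, if the sysctl node exists.
# vm.pageout_oom_seq is the node named in r290920; show_oom_seq and the
# fallback text are illustrative only.
show_oom_seq() {
    if value=$(sysctl -n vm.pageout_oom_seq 2>/dev/null); then
        echo "vm.pageout_oom_seq=$value"
    else
        echo "vm.pageout_oom_seq is not available on this system"
    fi
}
show_oom_seq

# To experiment with another value at runtime (as root); 6 is an arbitrary
# example, not a recommendation:
#   sysctl vm.pageout_oom_seq=6
```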

However, in my case, even though vm.pageout_oom_seq is at its default of 12, the OOM-killer did not work as expected.
I suspect this is a bug, but I'm not sure because I don't fully understand the code.
I just want the OOM-killer on FreeBSD 11.0 to behave as it does on FreeBSD 9.3.
Does anyone know how to solve this?

Thanks!