Re: git: 4a864f624a70 - main - vm_pageout: Print a more accurate message to the console before an OOM kill

From: Mark Johnston <markj_at_freebsd.org>
Date: Sat, 15 Jan 2022 15:55:18 UTC

On Fri, Jan 14, 2022 at 09:38:56PM -0800, Mark Millard wrote:
> Thanks. This will allow me to remove part of my personal additions
> in this area --and my having to explain the misnomer when trying
> to help someone analyze why they end up with OOM activity so they
> can figure out what to do about it.
> 
> There seem to be two separate sources of VM_OOM_SWAPZ. Showing
> my personal additions for them (just making them explicit in the
> sequence of messages generated):
> 
> diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c
> index 01cf9233329f..280621ca51be 100644
> --- a/sys/vm/swap_pager.c
> +++ b/sys/vm/swap_pager.c
> @@ -2091,6 +2091,7 @@ swp_pager_meta_build(vm_object_t object, vm_pindex_t pindex, daddr_t swapblk)
>                                     0, 1))
>                                         printf("swap blk zone exhausted, "
>                                             "increase kern.maxswzone\n");
> +                               printf("swp_pager_meta_build: swap blk uma zone exhausted\n");
>                                 vm_pageout_oom(VM_OOM_SWAPZ);
>                                 pause("swzonxb", 10);
>                         } else
> @@ -2121,6 +2122,7 @@ swp_pager_meta_build(vm_object_t object, vm_pindex_t pindex, daddr_t swapblk)
>                                     0, 1))
>                                         printf("swap pctrie zone exhausted, "
>                                             "increase kern.maxswzone\n");
> +                               printf("swp_pager_meta_build: swap pctrie uma zone exhausted\n");
>                                 vm_pageout_oom(VM_OOM_SWAPZ);
>                                 pause("swzonxp", 10);
>                         } else
> 
> Care to comment on the distinctions and why there are two
> contexts classified as "out of swap space"? Would either
> one show the swap space as (nearly?) all used in, say, top?
> Or might one of them still end up looking like a misnomer
> from just a top (or whatever) display?

Hmm, those cases should likely be changed from "out of swap space" to
"failed to allocate swap metadata" or something like that.  Running out
of swap space is not itself a reason to trigger an OOM kill; if the page
daemon can continue to reclaim clean pages while swap is full, then
it'll do so without killing anything.  If the swap devices are full and
the only way to reclaim memory is by laundering dirty pages, then
"failed to reclaim memory" is the message you'd likely see after this
commit.

The two cases which call vm_pageout_oom(VM_OOM_SWAPZ) arise when the
swap pager fails to allocate structures used to map physical pages to
their location on a swap device.  swap_pager_swap_init() pre-allocates
these structures during boot, and the size of the reserves is based on
the amount of physical memory.  In particular, each VM object maintains
a trie of "swap blocks", each of which maps a run of SWAP_META_PAGES
pages contiguous within an object to individual blocks on a swap device.
One zone provides internal nodes for the trie, while the other provides
these swap blocks.  Assuming perfect efficiency, the reserves provide
enough memory to allow all of physical memory to be swapped out, I
believe.  In practice there can be external fragmentation of the page
index space which leads to less than perfect utilization of these
metadata structures, in which case it's possible to exhaust the
reserves.  This seems to be a fairly rare scenario though.
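
To make the fragmentation point concrete, here is a rough userland
sketch, not kernel code.  It assumes each swap block covers one aligned
run of 16 page indices; the names sketch_blocks_min(),
sketch_blocks_needed() and SKETCH_META_PAGES are made up for
illustration, and the value 16 is an assumption rather than something
taken from the kernel headers.  It only counts how many swap blocks a
given set of swapped-out page indices would consume:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed pages per swap block; hypothetical stand-in for SWAP_META_PAGES. */
#define SKETCH_META_PAGES 16

/* Fewest swap blocks that could cover n pages (perfectly dense packing). */
static size_t
sketch_blocks_min(size_t n)
{
	return ((n + SKETCH_META_PAGES - 1) / SKETCH_META_PAGES);
}

/*
 * Swap blocks consumed by n swapped-out pages of one object, given their
 * page indices sorted in ascending order.  Each distinct aligned run of
 * SKETCH_META_PAGES indices costs one block, however few pages it holds.
 */
static size_t
sketch_blocks_needed(const uint64_t *pindex, size_t n)
{
	size_t blocks = 0;
	uint64_t last = 0;

	for (size_t i = 0; i < n; i++) {
		/* The input must be sorted for the run counting to work. */
		assert(i == 0 || pindex[i] >= pindex[i - 1]);

		uint64_t run = pindex[i] / SKETCH_META_PAGES;
		if (i == 0 || run != last) {
			blocks++;
			last = run;
		}
	}
	return (blocks);
}
```

With 32 pages at indices 0..31, sketch_blocks_needed() returns 2, the
same as sketch_blocks_min(32).  With 32 pages at indices 0, 16, 32, ...,
it returns 32, so the same amount of swapped data uses 16 times as many
swap blocks.  That is the kind of sparse layout that could exhaust a
reserve sized for the dense case.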