Re: git: 4a864f624a70 - main - vm_pageout: Print a more accurate message to the console before an OOM kill [MFC in time for 13.1?]

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 28 Feb 2022 19:04:28 UTC

On 2022-Feb-26, at 17:10, Mark Millard <marklmi@yahoo.com> wrote:

> On 2022-Jan-15, at 07:55, Mark Johnston <markj@FreeBSD.org> wrote:
> 
>> On Fri, Jan 14, 2022 at 09:38:56PM -0800, Mark Millard wrote:
>>> Thanks. This will allow me to remove part of my personal additions
>>> in this area --and my having to explain the misnomer when trying
>>> to help someone analyze why they end up with OOM activity so they
>>> can figure out what to do about it.
>>> 
>>> There seem to be two separate sources of VM_OOM_SWAPZ. Showing
>>> my personal additions for them (just making them explicit in the
>>> sequence of messages generated):
>>> 
>>> diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c
>>> index 01cf9233329f..280621ca51be 100644
>>> --- a/sys/vm/swap_pager.c
>>> +++ b/sys/vm/swap_pager.c
>>> @@ -2091,6 +2091,7 @@ swp_pager_meta_build(vm_object_t object, vm_pindex_t pindex, daddr_t swapblk)
>>>                                   0, 1))
>>>                                       printf("swap blk zone exhausted, "
>>>                                           "increase kern.maxswzone\n");
>>> +                               printf("swp_pager_meta_build: swap blk uma zone exhausted\n");
>>>                               vm_pageout_oom(VM_OOM_SWAPZ);
>>>                               pause("swzonxb", 10);
>>>                       } else
>>> @@ -2121,6 +2122,7 @@ swp_pager_meta_build(vm_object_t object, vm_pindex_t pindex, daddr_t swapblk)
>>>                                   0, 1))
>>>                                       printf("swap pctrie zone exhausted, "
>>>                                           "increase kern.maxswzone\n");
>>> +                               printf("swp_pager_meta_build: swap pctrie uma zone exhausted\n");
>>>                               vm_pageout_oom(VM_OOM_SWAPZ);
>>>                               pause("swzonxp", 10);
>>>                       } else
>>> 
>>> Care to comment on the distinctions and why there are two
>>> contexts classified as "out of swap space"? Would either
>>> one show the swap space as (nearly?) all used in, say, top?
>>> Or might one of them still end up looking like a misnomer
>>> from just a top (or whatever) display?
>> 
>> Hmm, those cases should likely be changed from "out of swap space" to
>> "failed to allocate swap metadata" or something like that.
> 
> The above does not seem to have happened yet in main [so: 14].
> 
> Will 13.1 get an MFC of 4a864f624a70 in time, possibly with the
> above change also in place to fully avoid misnomer reporting
> that misleads folks?
> 
> 4a864f624a70 listed:
> 
> MFC after:	2 weeks
> 
> but it has been more than a month.
> 
>> . . .
>> 
> 

Thanks for the stable/13 MFC as 13ba1d283676. It
provides a big improvement over the prior messaging
for the OOM kills.

For reference, I do still view:

+		case VM_OOM_SWAPZ:
+			reason = "out of swap space";
+			break;

as using a confusing misnomer ("swap space") in its
message. But, so far as I know, VM_OOM_SWAPZ is a
rarity and possibly very difficult to produce. If
so, any confusion from the message should also be
rare.
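
For illustration only, here is a minimal userland sketch of the
rewording suggested earlier in the thread ("failed to allocate swap
metadata" for the VM_OOM_SWAPZ case). It is not the kernel code:
the enum values, the function name oom_reason_str, the default
string, and the text used for the two non-SWAPZ cases are all
assumptions made so the example stands alone.

```c
/*
 * Userland stand-ins for the kernel's OOM shortage codes.
 * The numeric values are assumptions for this sketch.
 */
enum oom_shortage {
	VM_OOM_MEM = 1,
	VM_OOM_MEM_PF = 2,
	VM_OOM_SWAPZ = 3
};

/*
 * Hypothetical helper: map a shortage code to the reason text
 * printed before an OOM kill.
 */
static const char *
oom_reason_str(int shortage)
{
	switch (shortage) {
	case VM_OOM_MEM:
		/* Assumed wording, not taken from this thread. */
		return ("failed to reclaim memory");
	case VM_OOM_MEM_PF:
		/* Assumed wording, not taken from this thread. */
		return ("a thread waited too long to allocate a page");
	case VM_OOM_SWAPZ:
		/*
		 * The swap blk or swap pctrie UMA zone was exhausted.
		 * Swap space itself may be mostly free, so avoid
		 * calling this "out of swap space".
		 */
		return ("failed to allocate swap metadata");
	default:
		return ("unknown OOM reason");
	}
}
```

With wording like this, someone comparing the console message
against a top display showing plenty of free swap would not be
misled into looking for a swap space shortage.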

===
Mark Millard
marklmi at yahoo.com