Re: "failed to reclaim memory" with much free physmem

From: Mark Millard <marklmi_at_yahoo.com>
Date: Fri, 12 Sep 2025 17:22:02 UTC
On Sep 12, 2025, at 08:23, Mark Millard <marklmi@yahoo.com> wrote:

> Garrett Wollman <wollman_at_bimajority.org>
> Date: Fri, 12 Sep 2025 04:05:34 UTC
> 
> . . .
> 
>> https://bimajority.org/%7Ewollman/memory-pinpoint%3D1756957462%2C1757648662.png
>> 
>> shows the memory utilization over the course of the past week
>> including the incident on Tuesday morning. I don't know why there's
>> 25G of inactive pages for three days leading up to the OOM; perhaps
>> that's related? Inactive is normally much less than 1G.
> 
> Is the growth to huge wired figures like 932.89G something
> new --or has such been historically normal?

At various stages, what does:

# sysctl vm | grep -e stats.free_ -e stats.vm.v_free_

show? As an example, here is output from my current,
single-domain context:

# sysctl vm | grep -e stats.free_ -e stats.vm.v_free_
vm.domain.0.stats.free_severe: 186376
vm.domain.0.stats.free_min: 308882
vm.domain.0.stats.free_reserved: 63871
vm.domain.0.stats.free_target: 1043915
vm.domain.0.stats.free_count: 41010336
vm.stats.vm.v_free_severe: 186376
vm.stats.vm.v_free_count: 41010331
vm.stats.vm.v_free_min: 308882
vm.stats.vm.v_free_target: 1043915
vm.stats.vm.v_free_reserved: 63871

The output would not look as redundant in a multi-domain
context: each domain gets its own vm.domain.<N>.stats.free_*
figures there.


More detail about some of the figures in that output
is below.

First there are the per-domain figures (shown for a
non-NUMA context, so only the 1 domain):

# sysctl -d vm.domain | grep "\.stats\.free_"
vm.domain.0.stats.free_severe: Severe free pages
vm.domain.0.stats.free_min: Minimum free pages
vm.domain.0.stats.free_reserved: Reserved free pages
vm.domain.0.stats.free_target: Target free pages
vm.domain.0.stats.free_count: Free pages

# sysctl vm.domain | grep "\.stats\.free_"
vm.domain.0.stats.free_severe: 186376
vm.domain.0.stats.free_min: 308882
vm.domain.0.stats.free_reserved: 63871
vm.domain.0.stats.free_target: 1043915
vm.domain.0.stats.free_count: 40923251

The domain's vmd_oom_seq value increments
when there is a shortage that has not
changed and:

vmd->vmd_free_count < vmd->vmd_pageout_wakeup_thresh

where:

vmd->vmd_pageout_wakeup_thresh = (vmd->vmd_free_target / 10) * 9

Or, in terms of the sysctl interface:

(vm.domain.?.stats.free_target / 10) * 9

(It is not explicitly published via sysctl from what
I saw.)
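
As a concrete worked example using the free_target figure
shown earlier (a sketch; domain 0 assumed, sh arithmetic):

# sysctl -n vm.domain.0.stats.free_target
1043915
# echo $(( (1043915 / 10) * 9 ))
939519

So the wakeup threshold for that domain would work out to
939519 free pages here.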

The domain's vmd_oom_seq value is compared to the
value reported by vm.pageout_oom_seq, but there is
"voting" across all the domains for the overall OOM
decision.
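
For reference, that knob itself can be inspected (and
tuned) via sysctl; a minimal check (12 being the default
I recall from the sources, so your value may differ):

# sysctl vm.pageout_oom_seq
vm.pageout_oom_seq: 12

A domain has to stay short for that many back-to-back
passes before it votes, and, as I understand it, all the
domains have to be voting before the OOM kill actually
happens.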

There are 2 of these figures that ZFS in particular uses:

/usr/main-src/sys/contrib/openzfs/module/os/freebsd/zfs/sysctl_os.c:	if (val < minfree)
/usr/main-src/sys/contrib/openzfs/module/os/freebsd/zfs/arc_os.c:	zfs_arc_free_target = vm_cnt.v_free_target;
/usr/main-src/sys/contrib/openzfs/include/os/freebsd/spl/sys/kmem.h:#define	minfree				vm_cnt.v_free_min

In sysctl terms, those two are in the list:

# sysctl -d vm.stats.vm | grep "\<v_free_"
vm.stats.vm.v_free_severe: Severe page depletion point
vm.stats.vm.v_free_count: Free pages
vm.stats.vm.v_free_min: Minimum low-free-pages threshold
vm.stats.vm.v_free_target: Pages desired free
vm.stats.vm.v_free_reserved: Pages reserved for deadlock

# sysctl vm.stats.vm | grep "\<v_free_"
vm.stats.vm.v_free_severe: 186376
vm.stats.vm.v_free_count: 40997647
vm.stats.vm.v_free_min: 308882
vm.stats.vm.v_free_target: 1043915
vm.stats.vm.v_free_reserved: 63871

These are overall (system-wide) figures, not
per-NUMA-domain ones.

ZFS does not seem to do per-NUMA-domain memory
usage management: it does not use any interface that
would provide such per-domain information, as far as
I've seen.


===
Mark Millard
marklmi at yahoo.com