Re: "failed to reclaim memory" with much free physmem

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Sat, 13 Sep 2025 02:25:25 UTC
On Fri, Sep 12, 2025 at 7:09 PM Rick Macklem <rick.macklem@gmail.com> wrote:
>
> On Fri, Sep 12, 2025 at 6:35 PM Garrett Wollman <wollman@bimajority.org> wrote:
> >
> > <<On Fri, 12 Sep 2025 18:29:30 -0700, Rick Macklem <rick.macklem@gmail.com> said:
> >
> > > Lets see, 50% of memory allocated to mbufs and 99.9%
> > > of physical memory allowed for the arc.
> > > - This reminds me of the stats CNN puts up, where the
> > >   percentages never add up to 100.
> >
> > The point being that the ARC is supposed to respond to backpressure
> > long before memory runs out.
The problem is that it must react quickly and aggressively enough.

If you've ever studied queueing theory, you know that it is difficult to
impossible to stabilize a system without feedback. For NFS the feedback is
the replies to RPCs that throttle the clients. However, throw in a bunch of
clients and large TCP send windows and the feedback doesn't happen that
quickly.
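
To make the delayed-feedback point concrete, here is a toy Python sketch (not
a model of NFS or TCP). A sender offers more work than the server can drain
and only backs off when the queue depth it sees, which is some number of
ticks old, passes a threshold. The function name, the rates, the threshold,
and the idea of a "tick" are all assumptions made up for illustration.

```python
from collections import deque


def peak_queue_depth(delay_ticks, ticks=500, offered=20, drain=10, threshold=50):
    """Return the peak depth of a queue throttled by stale feedback."""
    depth = 0
    # Depths as the sender sees them; the leftmost entry is delay_ticks old.
    seen = deque([0] * (delay_ticks + 1), maxlen=delay_ticks + 1)
    peak = 0
    for _ in range(ticks):
        stale_depth = seen[0]
        # The sender stops only once the stale reading exceeds the threshold.
        arrival = 0 if stale_depth > threshold else offered
        depth = max(0, depth + arrival - drain)
        seen.append(depth)
        peak = max(peak, depth)
    return peak


if __name__ == "__main__":
    for delay in (0, 5, 10):
        print(f"feedback delay {delay:2d} ticks -> peak depth {peak_queue_depth(delay)}")
```

With these made-up numbers the peak depth grows with the feedback delay,
since the queue keeps filling for the whole time the sender is looking at
old data.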

If I were trying to fix this, I'd start by either:
- setting vfs.zfs.arc_max to a much smaller value than 99.9% and seeing
  whether that stabilizes the server. If I were lucky and it did, I'd slowly
  increase the value, then cut it back by a fair amount after the first failure.
  (I might also be tempted to decrease kern.ipc.maxmbufmem.)
OR
- taking a good look at the old FreeBSD 13.n code to see how it
  adjusted the ARC, and then trying to make the new code do the same
  thing. (I noted that there is a lot more code in the Linux port than
  the FreeBSD port of the current ZFS code, found in os/<name>/zfs/arc_os.c.)
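
A rough Python sketch of the arithmetic behind the first option: how far the
two limits over-commit memory, and one way to pick a starting ARC cap and a
cut-back value. The 256 GiB of RAM, the 10% headroom, and the 25% cut are
hypothetical numbers chosen for illustration (the thread does not state total
RAM), and the helper names are made up.

```python
GIB = 1024 ** 3


def overcommit_fraction(mbuf_fraction, arc_fraction):
    """Fraction of physical memory promised twice by the two limits."""
    return max(0.0, mbuf_fraction + arc_fraction - 1.0)


def candidate_arc_max(physmem_bytes, mbuf_fraction, headroom_fraction=0.10):
    """Starting ARC cap in bytes: what is left after mbufs and headroom."""
    remaining = 1.0 - mbuf_fraction - headroom_fraction
    return max(0, int(physmem_bytes * remaining))


def cut_back(failed_arc_max, cut_fraction=0.25):
    """Value to settle on after the first failure seen at failed_arc_max."""
    return int(failed_arc_max * (1.0 - cut_fraction))


if __name__ == "__main__":
    physmem = 256 * GIB  # hypothetical; not a figure from this thread
    print(f"over-committed: {overcommit_fraction(0.50, 0.999):.1%}")
    start = candidate_arc_max(physmem, mbuf_fraction=0.50)
    print(f"starting point: vfs.zfs.arc_max={start}")
    print(f"after a failure there: vfs.zfs.arc_max={cut_back(start)}")
```

The starting point is deliberately conservative; the idea in the text is to
raise it in small steps from there and back off once the failure reappears.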

If I had a setup where I could test/play with this, I think it would be
kinda fun, but I doubt something done on a 4Gbyte laptop is going
to produce similar results, especially when I really only have one NFS
client to generate load against it.

Good luck with whatever you try, rick

> >  And again, we're talking about a system
> > with 100 GiB of outright FREE physical memory.  There's no possible
> > way that can be fully allocated in less than 5 minutes -- the NICs
> > aren't that fast and the servers aren't doing anything else.
> I don't recall you mentioning your NIC speed, but 10Gbps->about 1Gbyte/sec.
> That's 100sec. But you certainly could be correct.
>
> rick
>
> >
> > -GAWollman
> >