Re: "failed to reclaim memory" with much free physmem
- In reply to: Garrett Wollman : "RE: "failed to reclaim memory" with much free physmem"
Date: Fri, 12 Sep 2025 00:22:10 UTC
On Thu, Sep 11, 2025 at 10:58 AM Garrett Wollman <wollman@bimajority.org> wrote:
>
> <<On Tue, 9 Sep 2025 12:19:21 -0700, Mark Millard <marklmi@yahoo.com> said:
>
> > Garrett Wollman <wollman_at_bimajority.org> wrote on
> > Date: Tue, 09 Sep 2025 16:19:42 UTC :
>
> >> On some of our newer large-memory NFS servers, we are seeing services
> >> killed with "failed to reclaim memory". According to our monitoring,
> >> the server has >100G of physmem free at the time,
>
> > Was that 100G+ somewhat before any reclaiming of memory started,
> > the lead-up to the notice?
>
> That was within five minutes of munin-node getting shot by the OOM
> killer. There was much less memory free ca. 24 hours before the
> event.
>
> > Any likelihood of sudden, rapid, huge drops in free RAM based on
> > workload behavior?
>
> I don't have access to client workloads, but it would have to be a bug
> in ZFS if so; these are file servers, all they run is NFS.

Bug or tuning weakness?
If you look at sys/contrib/openzfs/module/os/linux/zfs/arc_os.c, it does
a bunch of arm-waving setting arc_sys_free, whereas
sys/contrib/openzfs/module/os/freebsd/zfs/arc_os.c doesn't do anything.
--> I'd try tuning it via vfs.zfs.arc.sys_free?
(The default is 0 and that says "use all of the memory" if I read it
correctly. I probably haven't read it correctly, which was why I
suggested you compare the two of them.)

rick

>
> > Is NUMA involved?
>
> Damn if I know.
>
> >> and the only
> >> solution seems to be rebooting. (There is a small amount of swap
> >> configured and even less of it in use.)
>
> > That swap is in use at all could be of interest. I wonder
> > what it was doing when the swap was put to use or laundry
> > was growing that led to swap being put to use.
>
> It's pretty normal on these servers, which stay up for six months
> between OS upgrades, for some userland daemons to get swapped out,
> although I agree that it seems like it shouldn't happen given that the
> size of memory (1 TiB) is much greater than the size of running
> processes (< 1 GiB).
>
> My suspicion here is that there's some sort of accounting error, but I
> don't know where to look, and I only have data retrospectively, and
> only the data that munin is collecting. (Someone else was on call
> when this happened most recently and they reported that their login
> shell kept on getting shot -- as was the getty on the serial console.)
>
> -GAWollman
>
>
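
For anyone wanting to experiment with the vfs.zfs.arc.sys_free suggestion above, here is a minimal sketch of how a candidate value might be computed. It assumes the tunable is expressed in bytes (worth verifying against the OpenZFS sources for the release in use). The helper name sys_free_setting and the 1/32 fraction are purely illustrative, not a recommendation from this thread or a default taken from either arc_os.c.

```python
GIB = 1024 ** 3


def sys_free_setting(physmem_bytes: int, reserve_fraction: float) -> str:
    """Format a sysctl-style line asking the ARC to leave a fraction of RAM free.

    Hypothetical helper: both the name and the idea of sizing the reserve
    as a fixed fraction of physical memory are assumptions for illustration.
    """
    if not 0 < reserve_fraction < 1:
        raise ValueError("reserve_fraction must be strictly between 0 and 1")
    reserve_bytes = int(physmem_bytes * reserve_fraction)
    return f"vfs.zfs.arc.sys_free={reserve_bytes}"


if __name__ == "__main__":
    # 1 TiB of RAM, as on the servers discussed; 1/32 is an arbitrary example.
    print(sys_free_setting(1024 * GIB, 1 / 32))
```

With these example inputs the reserve works out to 32 GiB. Whether the value can be changed on a running system or has to be set at boot, and whether a reserve of that size actually avoids the OOM kills, would need to be checked on the affected servers.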