Re: FYI; 14.3: A discord report of Wired Memory growing to 17 GiBytes over something like 60 days; ARC shrinks to, say, 1942 MiBytes

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Thu, 14 Aug 2025 02:38:00 UTC
On Wed, Aug 13, 2025 at 05:42:55PM -0700, Rick Macklem wrote:
> On Wed, Aug 13, 2025 at 3:20 PM Rick Macklem <rick.macklem@gmail.com> wrote:
> >
> > On Wed, Aug 13, 2025 at 12:47 AM Darrin Smith <beldin@beldin.org> wrote:
> > >
> > > On Tue, 12 Aug 2025 12:57:39 +0300
> > > Konstantin Belousov <kostikbel@gmail.com> wrote:
> > >
> > >
> > > > Start looking at differences in periodic shots of vmstat -z and
> > > > vmstat -m. It would not catch direct page allocators.
> > >
> > > Ok, I hope I'm reading these outputs correctly...
> > > Looking at vmstat -z, I am assuming the 'size' column shows the size of
> > > each malloc bucket and 'used' indicates the number of buckets used?
> > > (A quick look at vmstat.c, pointing me to memstat_get_*, suggests I'm on
> > > the right track.) This results in numbers around the right order of
> > > magnitude to match my memory.
> > >
> > > I have noticed, with 3 samples over the last 18 hours, that it
> > > looks like about 1/2 of my memory is now wired, which seems a little
> > > excessive, especially considering ZFS is only using about 6 1/2G
> > > according to top:
> > >
> > > Mem: 1568M Active, 12G Inact, 656M Laundry, 36G Wired, 994M Buf, 12G Free
> > > ARC: 6645M Total, 3099M MFU, 2617M MRU, 768K Anon, 49M Header, 877M Other
> > >      4995M Compressed, 5803M Uncompressed, 1.16:1 Ratio
> > > Swap: 8192M Total, 198M Used, 7993M Free, 2% Inuse
> > >
> > > In the middle of this range I was building about 1000 packages in
> > > poudriere, so it's been busy.
> > >
> > > Interestingly, the ZFS ARC size has actually dropped since 9 hours ago
> > > when I took the 2nd measurement (it was about 15G then), but that was at
> > > the height of the build, which suggests the ARC is happily expiring
> > > older stuff.
> > >
> > > So, assuming used * size is correct, I saw the following big changes
> > > in vmstat -z:
> > >
> > > vm_page:
> > >
> > > 18 hours ago (before build): 18159063040, 25473990656
> > >
> > > 9 hours ago (during build) : 27994304512, 29363249152
> > > delta                      : +9835241472, +3889258496
> > >
> > > recent sample              : 14337658880, 35773743104
> > > delta                      : -13656645632, +6410493952
> > >
> > >
> > > NAMEI:
> > >
> > > 18 hours ago:     2 267 478 016
> > >
> > > 9 hours ago :    13 991 848 960
> > > delta       :   +11 724 370 944
> > >
> > > recent sample:   24 441 244 672
> > > delta        :  +10 449 395 712
> > >
> > >
> > > zfs_znode_cache:
> > >
> > > 18 hours ago: 370777296
> > >
> > > 9 hours ago : 975800816
> > > delta       : +605023520
> > >
> > > recent sample: 156404656
> > > delta        : -819396160
> > >
> > > VNODE:
> > >
> > > 18 hours ago: 440384120
> > >
> > > 9 hours ago : 952734200
> > > delta       : +512350080
> > >
> > > recent sample: 159528160
> > > delta        : -793206040
> > >
> > > Everything else comes out to smaller numbers, so I assume it's probably
> > > not them.
> > >
> > > If I'm getting the numbers right, I'm seeing various caches
> > > expiring after the poudriere build finished. But that NAMEI seems to
> > > still be growing quite extensively; I don't know if that's expected or not :)
> > Are you running the nfsd?
> >
> > I ask because there might be a pretty basic blunder in the NFS server.
> > There are several places where the NFS server code calls namei() and
> > doesn't do an NDFREE_PNBUF() after the call.
> > All but one of them is related to the pNFS server, so it would not
> > affect anyone (no one uses it), but one of them is used to update the
> > V4 export list (a function called nfsrv_v4rootexport()).
> >
> > So Kostik, should there be a NDFREE_PNBUF() after a successful
> > namei() call to get rid of the buffer?
> So, I basically answered the question myself. After mjg@'s commit
> on Sep. 17, 2022 (5b5b7e2 in main), the buffer is always saved
> unless there is an error return.
Yes.

> 
> The "vmstat -z | fgrep NAMEI" count does increase by one each
> time I send a SIGHUP to mountd.
> This is fixed by adding a NDFREE_PNBUF().
> 
> However, one buffer each time exports are reloaded is probably
> not the leak you guys are looking for.

Definitely.

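For reference, the fixed pattern looks roughly like this (a minimal
sketch around a hypothetical helper, not the actual nfsrv_v4rootexport()
code): since 5b5b7e2, a successful namei() always leaves the pathname
buffer allocated, so the caller must release it explicitly.

/*
 * Hypothetical helper showing the namei()/NDFREE_PNBUF() pattern.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/uio.h>
#include <sys/namei.h>
#include <sys/vnode.h>

static int
lookup_export_path(const char *path, struct vnode **vpp)
{
	struct nameidata nd;
	int error;

	NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, path);
	error = namei(&nd);
	if (error != 0)
		return (error);
	/* Without this, the buffer kept by namei() leaks on every call. */
	NDFREE_PNBUF(&nd);
	*vpp = nd.ni_vp;	/* returned locked because of LOCKLEAF */
	return (0);
}

Each leaked buffer is one item in the NAMEI zone, which is consistent
with the count growing by one per SIGHUP to mountd.
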
I am not sure what they reported (instead of raw output, some
interpretation was provided), but so far it seems to be just normal vnode
caching. Perhaps they can compare the number of vnodes allocated against
the cap kern.maxvnodes. The allocation count should not exceed
maxvnodes significantly.