Re: FYI; 14.3: A discord report of Wired Memory growing to 17 GiBytes over something like 60 days; ARC shrinks to, say, 1942 MiBytes
- Reply: Rick Macklem : "Re: FYI; 14.3: A discord report of Wired Memory growing to 17 GiBytes over something like 60 days; ARC shrinks to, say, 1942 MiBytes"
- Reply: Jan Martin Mikkelsen : "Re: FYI; 14.3: A discord report of Wired Memory growing to 17 GiBytes over something like 60 days; ARC shrinks to, say, 1942 MiBytes"
- In reply to: Konstantin Belousov : "Re: FYI; 14.3: A discord report of Wired Memory growing to 17 GiBytes over something like 60 days; ARC shrinks to, say, 1942 MiBytes"
Date: Thu, 14 Aug 2025 04:35:19 UTC
On Wed, Aug 13, 2025 at 7:39 PM Konstantin Belousov <kostikbel@gmail.com> wrote:
>
> On Wed, Aug 13, 2025 at 05:42:55PM -0700, Rick Macklem wrote:
> > On Wed, Aug 13, 2025 at 3:20 PM Rick Macklem <rick.macklem@gmail.com> wrote:
> > >
> > > On Wed, Aug 13, 2025 at 12:47 AM Darrin Smith <beldin@beldin.org> wrote:
> > > >
> > > > On Tue, 12 Aug 2025 12:57:39 +0300
> > > > Konstantin Belousov <kostikbel@gmail.com> wrote:
> > > > >
> > > > > Start looking at differences in periodic shots of vmstat -z and
> > > > > vmstat -m. It would not catch direct page allocators.
> > > >
> > > > Ok, I hope I'm reading these outputs correctly...
> > > > Looking at vmstat -z I am assuming the 'size' column shows the size of
> > > > each malloc bucket and 'used' indicates the number of buckets used?
> > > > (A quick look at vmstat.c pointing me to memstat_get_* suggests I'm on
> > > > the right track.) This results in numbers around the right order of
> > > > magnitude to match my memory.
> > > >
> > > > I have taken 3 samples over the last 18 hours, in which time it
> > > > looks like about 1/2 of my memory is now wired, which seems a little
> > > > excessive, especially considering ZFS is only using about 6 1/2G
> > > > according to top:
> > > >
> > > > Mem: 1568M Active, 12G Inact, 656M Laundry, 36G Wired, 994M Buf, 12G Free
> > > > ARC: 6645M Total, 3099M MFU, 2617M MRU, 768K Anon, 49M Header, 877M Other
> > > >      4995M Compressed, 5803M Uncompressed, 1.16:1 Ratio
> > > > Swap: 8192M Total, 198M Used, 7993M Free, 2% Inuse
> > > >
> > > > In the middle of this range I was building about 1000 packages in
> > > > poudriere, so it's been busy.
> > > >
> > > > Interestingly, the ZFS ARC size has actually dropped since 9 hours ago
> > > > when I took the 2nd measurement (it was about 15G then), but that was at
> > > > the height of the build and suggests the ARC is expiring older stuff
> > > > happily.
> > > >
> > > > So assuming used * size is correct, I saw the following big changes
> > > > in vmstat -z:
> > > >
> > > > vm_page:
> > > > 18 hours ago (before build): 18159063040, 25473990656
> > > >
> > > > 9 hours ago (during build) : 27994304512, 29363249152
> > > > delta                      : +9835241472, +3889258496
> > > >
> > > > recent sample              : 14337658880, 35773743104
> > > > delta                      : -13656645632, +6410493952
> > > >
> > > > NAMEI:
> > > > 18 hours ago : 2 267 478 016
> > > >
> > > > 9 hours ago  : 13 991 848 960
> > > > delta        : +11 724 370 944
> > > >
> > > > recent sample: 24 441 244 672
> > > > delta        : +10 449 395 712
> > > >
> > > > zfs_znode_cache:
> > > > 18 hours ago : 370777296
> > > >
> > > > 9 hours ago  : 975800816
> > > > delta        : +605023520
> > > >
> > > > recent sample: 156404656
> > > > delta        : -819396160
> > > >
> > > > VNODE:
> > > > 18 hours ago : 440384120
> > > >
> > > > 9 hours ago  : 952734200
> > > > delta        : +512350080
> > > >
> > > > recent sample: 159528160
> > > > delta        : -793206040
> > > >
> > > > Everything else comes out to smaller numbers, so I assume it's probably
> > > > not them.
> > > >
> > > > If I'm getting the numbers right, I'm seeing various caches
> > > > expiring after the poudriere build finished. But that NAMEI seems to be
> > > > growing quite extensively still; I don't know if that's expected or not :)
> > > Are you running the nfsd?
> > >
> > > I ask because there might be a pretty basic blunder in the NFS server.
> > > There are several places where the NFS server code calls namei() and
> > > they don't do a NDFREE_PNBUF() after the call.
> > > All but one of them is related to the pNFS server, so it would not
> > > affect anyone (no one uses it), but one of them is used to update the
> > > V4 export list (a function called nfsrv_v4rootexport()).
> > >
> > > So Kostik, should there be a NDFREE_PNBUF() after a successful
> > > namei() call to get rid of the buffer?
> > So, I basically answered the question myself. After mjg@'s commit
> > on Sep. 17, 2022 (5b5b7e2 in main), the buffer is always saved
> > unless there is an error return.
> Yes.
> >
> > The "vmstat -z | fgrep NAMEI" count does increase by one each
> > time I send a SIGHUP to mountd.
> > This is fixed by adding a NDFREE_PNBUF().
> >
> > However, one buffer each time exports are reloaded probably is
> > not the leak you guys are looking for.
>
> Definitely.
>
> I am not sure what they reported (instead of raw output some
> interpretation was provided), but so far it seems to be just normal vnode
> caching. Perhaps they can compare the number of vnodes allocated against
> the cap kern.maxvnodes. The allocation number should not exceed
> maxvnodes significantly.
>
Peter Eriksson posted this to me a little while ago...

  I wish I could upgrade our front-end servers from FreeBSD 13.5 btw - but
  there is a very troublesome issue with ZFS on FreeBSD 14+ - sometimes it
  runs amok and basically uses up all available RAM - and then the system
  load goes through the roof and the machine basically grinds to a halt for
  _long_ periods - happens when we run our backup rsync jobs.

  https://github.com/openzfs/zfs/issues/17052

rick
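
[Editorial note] To illustrate the pattern Rick describes, here is a minimal
sketch written against 14.x/main headers. It is not the actual
nfsrv_v4rootexport() code; the function name and the NDINIT flags are made up
for illustration, and the exact NDINIT signature differs on older branches.
The point is only that, since commit 5b5b7e2, a successful namei() keeps the
pathname buffer allocated, so the caller must release it with NDFREE_PNBUF().

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/namei.h>
#include <sys/vnode.h>

/*
 * Hypothetical lookup helper, not actual NFS server code.  Since commit
 * 5b5b7e2 the pathname buffer survives a successful namei(), so the caller
 * must release it with NDFREE_PNBUF(); forgetting to do so shows up as a
 * steadily growing NAMEI count in vmstat -z.
 */
static int
example_lookup(const char *path, struct vnode **vpp)
{
	struct nameidata nd;
	int error;

	NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, path);
	error = namei(&nd);
	if (error != 0)
		return (error);	/* buffer already freed on the error path */

	NDFREE_PNBUF(&nd);	/* the fix Rick describes: free the pathname buffer */
	*vpp = nd.ni_vp;	/* locked vnode; the caller vput()s it when done */
	return (0);
}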
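
[Editorial note] Kostik's suggestion above (compare the number of allocated
vnodes against the kern.maxvnodes cap) can be checked with plain
"sysctl vfs.numvnodes kern.maxvnodes". The small userland sketch below does
the same comparison; the sysctl names are the stock FreeBSD ones, but the
program itself and its percentage output are only an illustration.

#include <sys/types.h>
#include <sys/sysctl.h>

#include <err.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint64_t
read_count(const char *name)
{
	uint64_t val = 0;
	size_t len = sizeof(val);

	if (sysctlbyname(name, &val, &len, NULL, 0) != 0)
		err(1, "sysctlbyname(%s)", name);
	if (len == sizeof(uint32_t)) {	/* older branches export 32-bit values */
		uint32_t v32;

		memcpy(&v32, &val, sizeof(v32));	/* little-endian assumption */
		val = v32;
	}
	return (val);
}

int
main(void)
{
	uint64_t numvnodes = read_count("vfs.numvnodes");
	uint64_t maxvnodes = read_count("kern.maxvnodes");

	printf("vfs.numvnodes=%ju kern.maxvnodes=%ju (%.1f%% of the cap)\n",
	    (uintmax_t)numvnodes, (uintmax_t)maxvnodes,
	    maxvnodes != 0 ? 100.0 * numvnodes / maxvnodes : 0.0);
	/*
	 * Per the thread: a count sitting well above kern.maxvnodes would
	 * point at a leak rather than normal vnode caching.
	 */
	return (0);
}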