Re: Unable to limit memory consumption with vfs.zfs.arc_max
Date: Mon, 15 Jul 2024 20:24:37 UTC
Picking up this old thread since it's still vexing me....

On Sat, May 04, 2024 at 07:56:39AM -0400, Dan Langille wrote:
>
> This is from FreeBSD 14 on a Dell R730 in the basement (primary
> purposes: poudriere, PostgreSQL, and running four FreshPorts nodes):
>
> From top:
>
> ARC: 34G Total, 14G MFU, 9963M MRU, 22M Anon, 1043M Header, 9268M Other
>      18G Compressed, 41G Uncompressed, 2.28:1 Ratio
>
> % grep arc /boot/loader.conf
> vfs.zfs.arc_max="36000M"
>
> Looks like the value to set is:
>
> % sysctl -a vfs.zfs.arc | grep max
> vfs.zfs.arc.max: 37748736000
>
> Perhaps not a good example, but this might be more appropriate:
>
> % grep vfs.zfs.arc.max /boot/loader.conf
> vfs.zfs.arc_max="1200M"
>
> with top showing:
>
> ARC: 1198M Total, 664M MFU, 117M MRU, 3141K Anon, 36M Header, 371M Other
>      550M Compressed, 1855M Uncompressed, 3.37:1 Ratio

Thank you, Dan, I appreciate you chiming in. Unfortunately, I think I
have those bases covered, although I'm open to anything I may have
missed:

# grep -i arc /boot/loader.conf /etc/sysctl.conf
/boot/loader.conf:vfs.zfs.arc.max=4294967296
/boot/loader.conf:vfs.zfs.arc_max=4294967296
/boot/loader.conf:vfs.zfs.arc.min=2147483648
/etc/sysctl.conf:vfs.zfs.arc_max=4294967296
/etc/sysctl.conf:vfs.zfs.arc.max=4294967296
/etc/sysctl.conf:vfs.zfs.arc.min=2147483648

# top -b
last pid: 16257;  load averages:  0.80,  1.15,  1.18  up 0+02:03:34  12:05:06
55 processes:  2 running, 53 sleeping
CPU: 11.7% user,  0.0% nice, 18.4% system,  0.1% interrupt, 69.9% idle
Mem: 32M Active, 141M Inact, 11G Wired, 3958M Free
ARC: 10G Total, 5143M MFU, 4679M MRU, 2304K Anon, 44M Header, 219M Other
     421M Compressed, 4744M Uncompressed, 11.28:1 Ratio

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
11057 root          1 127    0    59M    33M CPU0     0  60:16  82.28% ssh
11056 root          5  24    0    22M    12M pipewr   3   6:00   6.25% zfs
 1619 snmpd         1  20    0    34M    14M select   0   0:06   0.00% snmpd
 1344 root          1  20    0    14M  3884K select   3   0:03   0.00% devd
 1544 root          1  20    0    13M  2776K select   3   0:01   0.00% syslogd
 1661 root          1  68    0    22M  9996K select   0   0:01   0.00% sshd
 1587 ntpd          1  20    0    23M  5876K select   1   0:00   0.00% ntpd
14391 root          1  20    0    22M    11M select   3   0:00   0.00% sshd
 2098 root          1  20    0    24M    11M select   1   0:00   0.00% httpd
 1904 root          1  20    0    24M    11M select   2   0:00   0.00% httpd
 1870 root          1  20    0    19M  8688K select   2   0:00   0.00% sendmail
 2067 root          1  20    0    19M  8688K select   1   0:00   0.00% sendmail
 2066 65529         1  20    0    13M  4564K select   2   0:00   0.00% mathlm
 1883 65529         1  20    0    11M  2772K select   3   0:00   0.00% mathlm
14397 root          1  20    0    14M  4568K wait     1   0:00   0.00% bash
 1636 root          1  20    0    13M  2608K nanslp   0   0:00   0.00% cron
 2082 root          1  20    0    13M  2560K nanslp   3   0:00   0.00% cron
 1887 root          1  20    0    13M  2568K nanslp   2   0:00   0.00% cron

# sysctl -a | grep m.u_evictable
kstat.zfs.misc.arcstats.mfu_evictable_metadata: 0
kstat.zfs.misc.arcstats.mfu_evictable_data: 0
kstat.zfs.misc.arcstats.mru_evictable_metadata: 0
kstat.zfs.misc.arcstats.mru_evictable_data: 0

An mrtg graph is attached showing ARC bytes used
(kstat.zfs.misc.arcstats.size) in green vs. ARC bytes max
(vfs.zfs.arc.max) in blue. We can see that, daily, the ARC bytes used
blow right past the 4G limit. Most days it is brought under control by
two reboots in /etc/crontab ("shutdown -r now" at 02:55 and 05:35),
although some days the system is too far gone by the time the cron job
rolls around, and it stays hung until I can get to the data center and
power cycle it.

I'm not very skilled at kernel debugging, but is a kernel PR in order?
This has happened with a GENERIC kernel across at least two builds of
14-STABLE:

FreeBSD 14.0-STABLE #0 stable/14-n267062-77205dbc1397: Thu Mar 28 12:12:02 PDT 2024
FreeBSD 14.1-STABLE #0 stable/14-n267886-4987c12cb878: Thu Jun  6 12:24:06 PDT 2024

Would it help to reproduce this with a -RELEASE version?

Thank you again, everyone.

Jim
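
P.S. For anyone cross-checking the numbers in this thread: loader.conf
accepts a size with an "M" suffix, while the sysctl reports plain bytes,
so the two representations should agree after multiplying out. A quick
arithmetic sketch (plain POSIX sh, nothing FreeBSD-specific) confirming
that Dan's "36000M" matches his sysctl readback of 37748736000:

```shell
#!/bin/sh
# vfs.zfs.arc_max="36000M" in /boot/loader.conf should read back from
# sysctl vfs.zfs.arc.max as 36000 * 1024 * 1024 bytes.
mib=36000
bytes=$((mib * 1024 * 1024))
echo "${bytes}"   # 37748736000
```

The same conversion gives 4294967296 for a 4G cap and 2147483648 for a
2G floor, matching the values shown in my loader.conf and sysctl.conf
above.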