Re: Unable to limit memory consumption with vfs.zfs.arc_max
Date: Mon, 15 Jul 2024 20:41:15 UTC
As Bugs Bunny often said, "What a maroon!"
Here's the attached MRTG graph.
On Mon, Jul 15, 2024 at 01:24:37PM -0700, Jim Long wrote:
> Picking up this old thread since it's still vexing me....
>
> On Sat, May 04, 2024 at 07:56:39AM -0400, Dan Langille wrote:
> >
> > This is from FreeBSD 14 on a Dell R730 in the basement (primary purposes: poudriere and PostgreSQL, plus four FreshPorts nodes):
> >
> > From top:
> >
> > ARC: 34G Total, 14G MFU, 9963M MRU, 22M Anon, 1043M Header, 9268M Other
> > 18G Compressed, 41G Uncompressed, 2.28:1 Ratio
> >
> > % grep arc /boot/loader.conf
> > vfs.zfs.arc_max="36000M"
> >
> > Looks like the value to set is:
> >
> > % sysctl -a vfs.zfs.arc | grep max
> > vfs.zfs.arc.max: 37748736000
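> >
> > (Sanity check: 36000M = 36000 * 1048576 = 37748736000 bytes, so the
> > legacy vfs.zfs.arc_max loader tunable did land in the new
> > vfs.zfs.arc.max name.)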
> >
> > Perhaps not a good example, but this might be more appropriate:
> >
> > % grep vfs.zfs.arc.max /boot/loader.conf
> > vfs.zfs.arc_max="1200M"
> >
> > with top showing:
> >
> > ARC: 1198M Total, 664M MFU, 117M MRU, 3141K Anon, 36M Header, 371M Other
> > 550M Compressed, 1855M Uncompressed, 3.37:1 Ratio
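> >
> > (Here top's 1198M total sits just under the 1200M cap, so the limit
> > is clearly being honored on this host.)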
>
> Thank you, Dan, I appreciate you chiming in.
>
> Unfortunately, I think I have those bases covered, although I'm open to
> anything I may have missed:
>
> # grep -i arc /boot/loader.conf /etc/sysctl.conf
> /boot/loader.conf:vfs.zfs.arc.max=4294967296
> /boot/loader.conf:vfs.zfs.arc_max=4294967296
> /boot/loader.conf:vfs.zfs.arc.min=2147483648
> /etc/sysctl.conf:vfs.zfs.arc_max=4294967296
> /etc/sysctl.conf:vfs.zfs.arc.max=4294967296
> /etc/sysctl.conf:vfs.zfs.arc.min=2147483648
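>
> (Both spellings should be equivalent; on FreeBSD 13+ the legacy
> vfs.zfs.arc_max name is kept as a compatibility alias for
> vfs.zfs.arc.max, as Dan's loader.conf example above demonstrates.
> The live value can be double-checked with:
>
> # sysctl vfs.zfs.arc.max vfs.zfs.arc_max
>
> and both should report 4294967296 here.)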
>
> # top -b
> last pid: 16257; load averages: 0.80, 1.15, 1.18 up 0+02:03:34 12:05:06
> 55 processes: 2 running, 53 sleeping
> CPU: 11.7% user, 0.0% nice, 18.4% system, 0.1% interrupt, 69.9% idle
> Mem: 32M Active, 141M Inact, 11G Wired, 3958M Free
> ARC: 10G Total, 5143M MFU, 4679M MRU, 2304K Anon, 44M Header, 219M Other
> 421M Compressed, 4744M Uncompressed, 11.28:1 Ratio
>
> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
> 11057 root 1 127 0 59M 33M CPU0 0 60:16 82.28% ssh
> 11056 root 5 24 0 22M 12M pipewr 3 6:00 6.25% zfs
> 1619 snmpd 1 20 0 34M 14M select 0 0:06 0.00% snmpd
> 1344 root 1 20 0 14M 3884K select 3 0:03 0.00% devd
> 1544 root 1 20 0 13M 2776K select 3 0:01 0.00% syslogd
> 1661 root 1 68 0 22M 9996K select 0 0:01 0.00% sshd
> 1587 ntpd 1 20 0 23M 5876K select 1 0:00 0.00% ntpd
> 14391 root 1 20 0 22M 11M select 3 0:00 0.00% sshd
> 2098 root 1 20 0 24M 11M select 1 0:00 0.00% httpd
> 1904 root 1 20 0 24M 11M select 2 0:00 0.00% httpd
> 1870 root 1 20 0 19M 8688K select 2 0:00 0.00% sendmail
> 2067 root 1 20 0 19M 8688K select 1 0:00 0.00% sendmail
> 2066 65529 1 20 0 13M 4564K select 2 0:00 0.00% mathlm
> 1883 65529 1 20 0 11M 2772K select 3 0:00 0.00% mathlm
> 14397 root 1 20 0 14M 4568K wait 1 0:00 0.00% bash
> 1636 root 1 20 0 13M 2608K nanslp 0 0:00 0.00% cron
> 2082 root 1 20 0 13M 2560K nanslp 3 0:00 0.00% cron
> 1887 root 1 20 0 13M 2568K nanslp 2 0:00 0.00% cron
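>
> (Note the ARC line above: 10G total against the 4G cap, and it
> accounts for nearly all of the 11G Wired.)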
>
> # sysctl -a | grep m.u_evictable
> kstat.zfs.misc.arcstats.mfu_evictable_metadata: 0
> kstat.zfs.misc.arcstats.mfu_evictable_data: 0
> kstat.zfs.misc.arcstats.mru_evictable_metadata: 0
> kstat.zfs.misc.arcstats.mru_evictable_data: 0
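>
> (With all four evictable counters at zero, the ARC has nothing in its
> MRU/MFU lists that it can release, which fits the runaway growth.)
>
> A crude way to watch the overrun in real time, using only the two
> values already being graphed, is a loop along these lines:
>
> # while :; do sysctl -n kstat.zfs.misc.arcstats.size vfs.zfs.arc.max; sleep 60; done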
>
> An MRTG graph is attached showing ARC bytes used
> (kstat.zfs.misc.arcstats.size) in green vs. the ARC byte limit
> (vfs.zfs.arc.max) in blue. Every day, ARC usage blows right past the
> 4G limit. Most days it is brought back under control by two reboots
> scheduled in /etc/crontab ("shutdown -r now" at 02:55 and 05:35), but
> some days the system is too far gone by the time the cron job rolls
> around, and it stays hung until I can get to the data center and
> power-cycle it.
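>
> For reference, the /etc/crontab entries are essentially (system
> crontab format, so they carry a user field and run as root):
>
> 55  2  *  *  *  root  /sbin/shutdown -r now
> 35  5  *  *  *  root  /sbin/shutdown -r now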
>
> I'm not very skilled at kernel debugging, but is a kernel PR (problem report) in order?
> This has happened with a GENERIC kernel across at least two builds of
> 14-STABLE:
>
> FreeBSD 14.0-STABLE #0 stable/14-n267062-77205dbc1397: Thu Mar 28 12:12:02 PDT 2024
> FreeBSD 14.1-STABLE #0 stable/14-n267886-4987c12cb878: Thu Jun 6 12:24:06 PDT 2024
>
> Would it help to reproduce this with a -RELEASE version?
>
>
> Thank you again, everyone.
>
> Jim