[Bug 290207] [ZFS] lowering "vfs.zfs.arc.max" to a low value causes kernel threads of "arc_evict" to use 91% CPU and disks to wait. System gets unresponsive...

From: <bugzilla-noreply_at_freebsd.org>
Date: Mon, 13 Oct 2025 14:51:26 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=290207

            Bug ID: 290207
           Summary: [ZFS] lowering "vfs.zfs.arc.max" to a low value causes
                    kernel threads of "arc_evict" to use 91% CPU and disks
                    to wait. System gets unresponsive...
           Product: Base System
           Version: 15.0-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: nbe@vkf-renzel.de

Hi,

under FreeBSD 15.0-BETA1, lowering the maximum ARC size via the sysctl
"vfs.zfs.arc.max" to something like 2G, 1G or 512M eventually causes the
"arc_evict" kernel threads to use 91% CPU (or more) and leaves the disks
waiting. This makes the whole system unresponsive.
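
For reference, the configured limits and the current ARC size can be checked
via sysctl; this is just a rough sketch of the commands I use
(kstat.zfs.misc.arcstats is the standard OpenZFS stats node):

# configured ARC limits
sysctl vfs.zfs.arc.max vfs.zfs.arc.min
# current ARC size, target and maximum as seen by the ARC itself
sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c kstat.zfs.misc.arcstats.c_max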

To reproduce quickly:

sysctl vfs.zfs.arc.max=1073741824
zpool scrub <YOURPOOLNAME>

Then take a look at the output of "top" (e.g. "top -SH 5" so that the kernel
threads are shown) and of "gstat":
---------------------------------- SNIP ---------------------------------- 
last pid: 16317;  load averages:  0.69,  0.36,  0.16; battery: 100%   up 0+03:04:00  16:35:34
600 threads:   16 running, 530 sleeping, 54 waiting
CPU:  0.1% user,  0.0% nice, 24.5% system,  0.2% interrupt, 75.3% idle
Mem: 25M Active, 547M Inact, 2970M Wired, 10G Free
ARC: 1059M Total, 231M MFU, 722M MRU, 400K Anon, 16M Header, 89M Other
     858M Compressed, 4170M Uncompressed, 4.86:1 Ratio
Swap: 16G Total, 16G Free

  PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
    0 root         59    -     0B  4176K CPU11   11   4:00  92.83% kernel{arc_evict_2}
    0 root         59    -     0B  4176K CPU8     8   4:19  92.48% kernel{arc_evict_1}
    0 root         59    -     0B  4176K CPU1     1   3:59  92.01% kernel{arc_evict_0}
    6 root        -13    -     0B  1616K aw.aew   6   0:52   7.18% zfskern{txg_thread_enter}
    6 root          1    -     0B  1616K tq_adr   2   0:16   2.49% zfskern{arc_evict}

(while, at the same time, "gstat" shows:)

dT: 1.000s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0    206    206   3295  0.156      0      0  0.000    3.2| nda0
    0    123    123   1967  0.192      0      0  0.000    2.4| nda1
---------------------------------- SNIP ---------------------------------- 
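
(If it is useful, the eviction-related ARC counters can be sampled while the
scrub runs; a rough sketch, using the stock arcstats names as they appear on
my system:)

# dump the eviction-related arcstats once per second while the scrub runs
while true; do
    sysctl kstat.zfs.misc.arcstats | grep -E 'evict|mutex_miss|\.size|\.c:'
    sleep 1
done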


The disks are NVMe SSDs capable of 800 MB/s and a lot of IOPS. Their normal
stats look like this:
---------------------------------- SNIP ----------------------------------
dT: 1.000s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    3   6406   6406 803622  0.465      0      0  0.000  100.0| nda0
    0   7369   7369 807316  0.076      0      0  0.000   50.8| nda1
---------------------------------- SNIP ----------------------------------


To get back to somewhat normal behaviour, you have to set ARC's maximum to a
higher value:

sysctl vfs.zfs.arc.max=8589934592

(Setting it to 0 [zero] does not help.)
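
(For completeness, the limit can also be set persistently, roughly like this;
the first form goes into /etc/sysctl.conf, the second, legacy spelling into
/boot/loader.conf:)

# /etc/sysctl.conf - applied at boot, can also be changed at runtime
vfs.zfs.arc.max=8589934592

# /boot/loader.conf - legacy tunable name, read before the pools are imported
vfs.zfs.arc_max="8589934592"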

This misbehaviour did not happen with the original, "old" FreeBSD ZFS codebase,
e.g. under 11.1-STABLE. As far as I remember it also did not happen under
12.4-RELEASE. My old poudriere build machine (an 8-core Ryzen, 16 GB RAM) was
using the "old" ZFS codebase with the ARC limited to 1 GB in order to give
poudriere enough RAM for TMPFS for its eight workers back then. No problems at
all.

My understanding is that limiting the cache to a value that is too small should
merely increase the number of disk accesses drastically, but should NOT spawn
eviction threads that make the disks wait for them...
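
In other words, with a tiny ARC I would only expect the hit ratio to collapse,
which can be verified with the standard arcstats counters (sketch):

# a tiny ARC should only show up as many more misses and disk reads...
sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses
# ...not as three arc_evict threads pinning whole CPU cores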



Thanks for looking into it and regards,
Nils

-- 
You are receiving this mail because:
You are the assignee for the bug.