Re: RFC: How ZFS handles arc memory use
- In reply to: Alexander Motin : "Re: RFC: How ZFS handles arc memory use"
Date: Wed, 22 Oct 2025 15:21:42 UTC
On Wed, Oct 22, 2025 at 8:05 AM Alexander Motin <mav@freebsd.org> wrote:
>
> Hi Rick,
>
> On 22.10.2025 10:34, Rick Macklem wrote:
> > A couple of people have reported problems with NFS servers,
> > where essentially all of the system's memory gets exhausted.
> > They see the problem on 14.n FreeBSD servers (which use the
> > newer ZFS code) but not on 13.n servers.
> >
> > I am trying to learn how ZFS handles arc memory use to try
> > and figure out what can be done about this problem.
> >
> > I know nothing about ZFS internals or UMA(9) internals,
> > so I could be way off, but here is what I think is happening.
> > (Please correct me on this.)
> >
> > The L1ARC uses uma_zalloc_arg()/uma_zfree_arg() to allocate
> > the arc memory. The zones are created using uma_zcreate(),
> > so they are regular zones. This means the pages are coming
> > from a slab in a keg, and are wired pages.
> >
> > The only time the size of the slab/keg will be reduced by ZFS
> > is when it calls uma_zone_reclaim(.., UMA_RECLAIM_DRAIN),
> > which is called by arc_reap_cb(), triggered by arc_reap_cb_check().
> >
> > arc_reap_cb_check() uses arc_available_memory() and triggers
> > arc_reap_cb() when arc_available_memory() returns a negative
> > value.
> >
> > arc_available_memory() returns a negative value when
> > zfs_arc_free_target (vfs.zfs.arc.free_target) is greater than freemem.
> > (By default, zfs_arc_free_target is set to vm_cnt.v_free_target.)
> >
> > Does all of the above sound about right?
>
> There are two mechanisms to reduce ARC size: either from the ZFS side in
> the way you described, or from the kernel side, when it calls the ZFS low
> memory handler arc_lowmem(). It feels somewhat like overkill, but it came
> this way from Solaris.
>
> Once ARC size is reduced and evictions into UMA caches have happened, it
> is up to UMA how to drain its caches. ZFS might trigger that itself, or it
> can be done by the kernel, or, a few years back, I added a mechanism for
> UMA caches to slowly shrink by themselves even without pressure.
>
> > This leads me to...
> > - zfs_arc_free_target (vfs.zfs.arc.free_target) needs to be larger
>
> There is a very delicate balance between ZFS and the kernel
> (zfs_arc_free_target = vm_cnt.v_free_target). Imbalance there makes one
> of them suffer.
>
> > or
> > - Most of the wired pages in the slab are per-cpu,
> > so uma_zone_reclaim() needs to use UMA_RECLAIM_DRAIN_CPU
> > on some systems. (Not the small test systems I have, where I
> > cannot reproduce the problem.)
>
> Per-CPU caches should be relatively small, IIRC dozens or hundreds of
> allocations per CPU. Their drain is expensive and should rarely be
> needed, unless you have too little RAM for the number of CPUs you have.
>
> > or
> > - uma_zone_reclaim() needs to be called under other
> > circumstances.
> > or
> > - ???
> >
> > How can you tell if a keg/slab is per-cpu?
> > (For my simple test system, I only see "UMA Slabs 0:" and
> > "UMA Slabs 1:". It looks like UMA Slabs 0: is being used for
> > ZFS arc allocation on this simple test system.)
> >
> > Hopefully folk who understand ZFS arc allocation or UMA
> > can jump in and help out, rick
>
> Before you dive into UMA, have you checked whether the ARC size really
> shrinks and eviction happens? Considering you mention NFS, I wonder
> what your number of open files is? Too many open files might in some
> cases restrict ZFS's ability to evict metadata from the ARC. arc_summary
> may give some insights about ARC state.
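[A minimal userland sketch of the trigger path described above, assuming only the free-page test from the thread (the real arc_available_memory() returns the lowest of several measures); freemem, zfs_arc_free_target, and the page counts in main() are made-up stand-ins for the kernel state:]

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define	PAGE_SIZE	4096

/* Stand-ins for kernel state. */
static uint64_t freemem;		/* free pages */
static uint64_t zfs_arc_free_target;	/* vfs.zfs.arc.free_target */

/*
 * Bytes of headroom; negative means the ARC should shrink.  Only the
 * free-page comparison from the thread is modeled here.
 */
static int64_t
arc_available_memory(void)
{

	return (((int64_t)freemem - (int64_t)zfs_arc_free_target) *
	    PAGE_SIZE);
}

/* The gate for arc_reap_cb(): reap only when headroom is negative. */
static bool
arc_reap_cb_check(void)
{

	return (arc_available_memory() < 0);
}

static void
arc_reap_cb(void)
{

	/*
	 * The kernel version evicts from the ARC and then calls
	 * uma_zone_reclaim(zone, UMA_RECLAIM_DRAIN) on the ARC zones.
	 * UMA_RECLAIM_DRAIN leaves per-CPU caches alone;
	 * UMA_RECLAIM_DRAIN_CPU would flush those too, the more
	 * expensive drain discussed in the thread.
	 */
	printf("reap: headroom %jd bytes, draining UMA caches\n",
	    (intmax_t)arc_available_memory());
}

int
main(void)
{

	zfs_arc_free_target = 100000;	/* ~ vm_cnt.v_free_target */

	freemem = 150000;		/* above target: no reap */
	if (arc_reap_cb_check())
		arc_reap_cb();

	freemem = 80000;		/* below target: reap fires */
	if (arc_reap_cb_check())
		arc_reap_cb();

	return (0);
}

[Compiled and run, only the second check fires, matching the description above: arc_reap_cb() runs only once freemem drops below zfs_arc_free_target.]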
I don't know if this helps, but the original post is here:
https://lists.freebsd.org/archives/freebsd-stable/2025-September/003126.html
Then you'll find the email thread that follows it here:
https://lists.freebsd.org/archives/freebsd-stable/2025-September/003145.html
Hopefully Garrett can respond with more information, rick

>
> --
> Alexander Motin