Re: FYI; 14.3: A discord report of Wired Memory growing to 17 GiBytes over something like 60 days; ARC shrinks to, say, 1942 MiBytes

From: Sulev-Madis Silber <freebsd-stable-freebsd-org730_at_ketas.si.pri.ee>
Date: Wed, 13 Aug 2025 23:34:40 UTC
can anybody explain to me how to cap zfs-related memory usage in the kernel?

or even help me get it

after i learned that a git pull on the ports tree with a ton of updates takes the machine down in seconds, i managed to find and adjust the git config to have this:

[core]
        packedGitWindowSize = 1m
        packedGitLimit = 1m
        preloadIndex = false
[diff]
        renameLimit = 1

i tried to get what those packedgit*'s do and i understand that they are caches, and somehow mmap is involved. dovecot also uses mmap, which i disabled as things started to die and i nearly lost sshd. tho i have a software watchdog configured to reboot if all of userland dies. that seems to be enough, as the kernel never freezes

what i want, or maybe we could have it as a default which one can override, is to limit zfs's apparent ability to exhaust all kernel memory. it doesn't matter how much ram the machine has, i think if the kernel already occupies 95% of ram, now is the time to slow down zfs operations

i'm used to this kind of gradual slowdown. but apparently with zfs it's fast till the end. i wonder how it stops after filling all the ram tho: did the demand just stop because userland got killed off, or would it eventually panic the kernel? and how to test it? in-kernel iscsi?

unsure how many would ever need this filesystem "battleshort" mode

that's a bug, right? i don't think the low ram or slow io would justify the failure either

i can think of a nasty failure mode where a vm with not uncommonly low ram experiences a temporary io slowdown due to a bottleneck somewhere, and this kills whatever important thing the userland program is doing. instead of the usual lockup, which could resolve itself, one would just have immediate data loss

the io part is a wild guess too

why this is ever an issue i don't even know. this is a standard setup

you could blame me for not having tested this on anything other than 13, but in one case i had a 100% success rate of taking the system down with git. fresh boot, pull, it's gone. that's not what one expects even from low power systems

zfs has appealing options for embedded too, like copies=3, compression, etc. even if it was originally designed not to (but why?), why can't it just pause io? all other fses do this. there are a number of real reasons to do this

seems like zfs has serious issues with saying no and would rather die first

it's probably the first time ever in my life seeing this

what's the fix or workaround i don't know

i get that zfs read and especially write operations are extremely complex, but can't they like wait or something? just like, iirc, arc is limited to 60% of ram by default, as more might be too insane. zfs could complete whatever atomic operation it needs to do and then just not take new ones. io would stall, but it could resolve
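
to make the "wait or something" idea concrete, here is a toy userland sketch in python. it is not how zfs works internally and it is not a fix, just a model of the behaviour i mean: operations already in flight always finish, new ones stall while memory in use would go above a high-water mark, and stalled ones resume once memory frees up. every name in it (WriteThrottle, submit, complete) and the 95% default are made up for illustration:

```python
from collections import deque


class WriteThrottle:
    """Toy admission control (hypothetical, not a zfs interface).

    New requests are admitted only while memory in use stays at or
    below a high-water mark; otherwise they are queued (stalled).
    """

    def __init__(self, total_bytes, high_water_pct=95):
        # Integer arithmetic keeps the limit exact.
        self.limit = total_bytes * high_water_pct // 100
        self.in_use = 0
        self.waiting = deque()  # stalled (name, size) requests, FIFO

    def submit(self, name, size):
        """Admit the request if it fits under the limit, else stall it.

        Returns True if admitted, False if queued. Nothing is dropped.
        """
        if not self.waiting and self.in_use + size <= self.limit:
            self.in_use += size
            return True
        self.waiting.append((name, size))
        return False

    def complete(self, size):
        """An in-flight operation finished: free its memory, then admit
        stalled requests in FIFO order for as long as they fit.

        Returns the names of the requests that were resumed.
        """
        self.in_use -= size
        resumed = []
        while self.waiting and self.in_use + self.waiting[0][1] <= self.limit:
            name, sz = self.waiting.popleft()
            self.in_use += sz
            resumed.append(name)
        return resumed


if __name__ == "__main__":
    t = WriteThrottle(total_bytes=100)
    print(t.submit("a", 60), t.submit("b", 30), t.submit("c", 10))
    print(t.complete(60), t.in_use)
```

the point of the sketch is only that the third request is stalled rather than failed, and gets picked up again once the first one completes. in a real kernel the hard part is of course that zfs itself needs memory to finish the in-flight work.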

or at least tell me i'm wrong and problem is elsewhere

feels similar to this tho...