[SOLVED] Re: Strange behavior after running under high load
phk at phk.freebsd.dk
Sun Apr 4 19:01:54 UTC 2021
Konstantin Belousov writes:
> But what would you provide as the input for PID controller, and what would be the targets?
Viewing this purely as a vnode-related issue is wrong; this is about memory allocation in general.
We may or may not want a PID regulator, but putting it on vnode counts would not improve things, precisely because, as you point out, the amount of memory a vnode ties up has enormous variance.
We should focus on the end goal: To ensure "sufficient" memory can always be allocated for any purpose "without major delay".
Architecturally there are three major problems:
A) While each subsystem generally has a good idea about memory that can be released "without major delay", the information does not trickle up through a summarizing NUMA aware tree.
B) We lack a nuanced call-back to tell the subsystems to release some of their memory "without major delay".
C) We have never attempted to enlist userland, where jemalloc often hangs on to a lot of unused VM pages.
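To make (A) and (B) concrete, here is a toy model in Python, not kernel code. It assumes each subsystem can register two callbacks, one estimating the bytes it can free "without major delay" and one freeing up to a requested amount, and that these hang off a tree with one interior node per NUMA domain. The names Node, Cache, estimate, release, releasable and request are all invented for this illustration and do not correspond to any existing FreeBSD interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """One node in a hypothetical summarizing tree (system -> NUMA domain -> subsystem)."""
    name: str
    # Hypothetical leaf callbacks: estimate cheaply releasable bytes, release up to N bytes.
    estimate: Optional[Callable[[], int]] = None
    release: Optional[Callable[[int], int]] = None
    children: List["Node"] = field(default_factory=list)

    def releasable(self) -> int:
        """Bytes this subtree claims it can free without major delay."""
        own = self.estimate() if self.estimate else 0
        return own + sum(child.releasable() for child in self.children)

    def request(self, want: int) -> int:
        """Ask this subtree to free up to `want` bytes; return the bytes actually freed."""
        freed = 0
        if self.release and want > 0:
            freed += self.release(want)
        for child in self.children:
            if freed >= want:
                break
            freed += child.request(want - freed)
        return freed


class Cache:
    """Stand-in for a subsystem holding `cheap` bytes it could drop immediately."""

    def __init__(self, cheap: int) -> None:
        self.cheap = cheap

    def estimate(self) -> int:
        return self.cheap

    def release(self, want: int) -> int:
        n = min(want, self.cheap)
        self.cheap -= n
        return n


# Example wiring: two NUMA domains, each with its own subsystem leaves.
vfs0, arc0, vfs1 = Cache(300), Cache(100), Cache(50)
domain0 = Node("domain0", children=[
    Node("vfs", vfs0.estimate, vfs0.release),
    Node("arc", arc0.estimate, arc0.release),
])
domain1 = Node("domain1", children=[Node("vfs", vfs1.estimate, vfs1.release)])
system = Node("system", children=[domain0, domain1])
```

A caller short on memory in one domain could then call domain0.request(n) and only fall back to system.request(n) if that comes up short. A real design would also need the "nuance" from (B), for instance an urgency level passed along with the request, which this sketch leaves out.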
As far as vnodes go:
It used to be that "without major delay" meant "without disk-I/O", which in turn led to the "dirty buffers/VM pages" heuristic.
With microsecond SSD backing store, that heuristic is not only invalid, it is downright harmful in many cases.
GEOM maintains estimates of per-provider latency and VM+VFS should use that to schedule write-back so that more of it happens outside rush-hour, in order to increase the amount of memory which can be released "without major delay".
Today that happens largely as a side effect of the periodic syncer, which does a really bad job at it, because it still expects VAX-era hardware performance and workloads.
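As a rough illustration of latency-driven write-back, the following Python sketch picks how many dirty pages to push out in one scheduling interval. It is a deliberately crude model under stated assumptions: the per-provider latency estimate and the busy percentage are simply passed in as numbers, writes are treated as serialized at one page per latency period, and the function name writeback_budget and its parameters are made up for the example rather than taken from GEOM, VM or VFS.

```python
def writeback_budget(dirty_pages: int, latency_us: int, busy_percent: int,
                     interval_us: int = 1_000_000) -> int:
    """Return how many dirty pages to write back during one interval.

    Assumptions (all hypothetical): `latency_us` is the estimated per-page
    write latency of the backing provider, `busy_percent` is how much of the
    interval foreground I/O is expected to occupy, and writes are serialized.
    """
    if latency_us <= 0:
        raise ValueError("latency_us must be positive")
    if not 0 <= busy_percent <= 100:
        raise ValueError("busy_percent must be between 0 and 100")
    # Time in this interval not claimed by foreground I/O.
    idle_us = interval_us * (100 - busy_percent) // 100
    # Never schedule more pages than are dirty or than fit in the idle time.
    return min(dirty_pages, idle_us // latency_us)
```

With a 100 microsecond device that is 50% busy, the one-second budget covers 5000 pages. With a 10 millisecond device that is 90% busy, it covers only 10. The point is that a fast, idle provider lets write-back run well ahead of demand, so more memory is clean and releasable when rush hour arrives.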
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.