svn commit: r332365 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs

Mark Johnston markj at FreeBSD.org
Tue Apr 10 15:17:39 UTC 2018


On Tue, Apr 10, 2018 at 05:09:57PM +0300, Slawa Olhovchenkov wrote:
> On Tue, Apr 10, 2018 at 01:56:06PM +0000, Mark Johnston wrote:
> 
> > Author: markj
> > Date: Tue Apr 10 13:56:06 2018
> > New Revision: 332365
> > URL: https://svnweb.freebsd.org/changeset/base/332365
> > 
> > Log:
> >   Set zfs_arc_free_target to v_free_target.
> >   
> >   Page daemon output is now regulated by a PID controller with a setpoint
> >   of v_free_target. Moreover, the page daemon now wakes up regularly
> >   rather than waiting for a wakeup from another thread. This means that
> >   the free page count is unlikely to drop below the old
> >   zfs_arc_free_target value, and as a result the ARC was not readily
> >   freeing pages under memory pressure. Address the immediate problem by
> >   updating zfs_arc_free_target to match the page daemon's new behaviour.
> 
> Can you explain some more about new page daemon algo (and reclaim zone
> free memory)?

The old algorithm was pretty simple: there was a free page target and,
below that, a wakeup threshold. Any time a thread allocated a page and
in so doing caused the free page count to drop below the wakeup
threshold, that thread would wake up the page daemon, which would scan
the inactive queue and free pages until either the free target or the
end of the inactive queue was reached.
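
As a rough illustration of the watermark scheme described above, here is
a minimal userland sketch. It is not the kernel code: struct vm_sketch,
alloc_page() and pagedaemon_scan() are hypothetical names, and "freeing"
a page is reduced to moving a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified state; not the kernel's data structures. */
struct vm_sketch {
	int free_count;     /* pages currently free */
	int free_target;    /* the page daemon stops freeing at this level */
	int wakeup_thresh;  /* below this, the allocating thread wakes the daemon */
	int inactive_count; /* reclaimable pages on the inactive queue */
};

/*
 * Allocate one page. Returns true if the allocation pushed the free
 * page count below the wakeup threshold, i.e. the caller should wake
 * the page daemon.
 */
static bool
alloc_page(struct vm_sketch *vm)
{
	assert(vm->free_count > 0);
	vm->free_count--;
	return (vm->free_count < vm->wakeup_thresh);
}

/*
 * One page daemon pass: free inactive pages until the free target is
 * reached or the inactive queue is exhausted. Returns pages freed.
 */
static int
pagedaemon_scan(struct vm_sketch *vm)
{
	int freed = 0;

	while (vm->free_count < vm->free_target && vm->inactive_count > 0) {
		vm->inactive_count--;
		vm->free_count++;
		freed++;
	}
	return (freed);
}
```

Note that nothing happens between the two watermarks: the daemon sleeps
until an allocation crosses wakeup_thresh, then does all of its work in
one pass.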

This is simple and easy to reason about, but has some drawbacks. When
memory pressure is constant, it leads to bursts of CPU usage and lock
contention. The static watermarks may also be insufficient for some
demanding workloads. In particular, the wakeup threshold might be too
low, thus allowing the free page count to drop to dangerous levels and
triggering expensive memory shortage handling (i.e., VM_WAIT).

The new algorithm uses a control loop to dynamically compute a target
for each scan of the inactive queue. The loop takes as input the
magnitude of the page shortage (v_free_target - v_free_count) and keeps
track of the rate of change of this difference (i.e., the rate at which
free pages are being consumed) and the sum of this difference over time
(i.e., a cumulative value for the magnitude of recent page shortages).
These factors are used to compute "shortage", the number of pages to
reclaim with the goal of maintaining a free page count of v_free_target.
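
To make the three inputs concrete, here is a minimal sketch of one step
of such a control loop. It is an illustration only, not the kernel's
implementation: struct pid_sketch and pid_step() are hypothetical names,
the gain divisors are arbitrary, the clamping of the accumulated sum is
my own simplification, and normalization by the sampling interval is
left out.

```c
#include <assert.h>

/* Hypothetical controller state; the field names are illustrative. */
struct pid_sketch {
	int setpoint;   /* desired free page count, e.g. v_free_target */
	int integral;   /* sum of recent shortages */
	int prev_error; /* shortage seen on the previous step */
	int kp_div;     /* divisor applied to the current shortage */
	int ki_div;     /* divisor applied to the accumulated shortage */
	int kd_div;     /* divisor applied to the change in shortage */
};

/*
 * One controller step. Takes the current free page count and returns
 * "shortage", the number of pages to try to reclaim in this scan.
 */
static int
pid_step(struct pid_sketch *pc, int free_count)
{
	int derivative, error, output;

	assert(pc->kp_div > 0 && pc->ki_div > 0 && pc->kd_div > 0);

	/* Magnitude of the shortage: setpoint minus free pages. */
	error = pc->setpoint - free_count;

	/* Sum of the shortage over time; assumed not to go negative. */
	pc->integral += error;
	if (pc->integral < 0)
		pc->integral = 0;

	/* Rate of change: how fast free pages are being consumed. */
	derivative = error - pc->prev_error;
	pc->prev_error = error;

	output = error / pc->kp_div + pc->integral / pc->ki_div +
	    derivative / pc->kd_div;
	return (output > 0 ? output : 0);
}
```

With setpoint = 1000 and divisors 1, 4 and 2, a first step at 900 free
pages yields 100 + 25 + 50 = 175 pages to reclaim; a second step at the
same level yields 100 + 50 + 0 = 150, since the shortage is no longer
growing.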

The effect of the new algorithm is that the page daemon runs more
frequently but for shorter durations, so its CPU usage is more even. It
responds dynamically to the demands of the workload, so the shortcomings
of a pair of static watermarks are gone.

r329882 doesn't really change anything with respect to reclamation of
pages from UMA zones. There are some plans to address shortcomings there
in the near future, though.

> PS: zfs need some more time for free pages from ARC. Also, vanila zfs
> have broken logic for count used and free ARC's memory. For most
> correctly count system-wide used and free memory need accounting
> in-zone free memory.

Yes, there are a number of problems in this area predating r329882. This
commit is really just a band-aid.

