svn commit: r332365 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs

Slawa Olhovchenkov slw at zxy.spb.ru
Tue Apr 10 16:02:41 UTC 2018


On Tue, Apr 10, 2018 at 11:17:33AM -0400, Mark Johnston wrote:

> On Tue, Apr 10, 2018 at 05:09:57PM +0300, Slawa Olhovchenkov wrote:
> > On Tue, Apr 10, 2018 at 01:56:06PM +0000, Mark Johnston wrote:
> > 
> > > Author: markj
> > > Date: Tue Apr 10 13:56:06 2018
> > > New Revision: 332365
> > > URL: https://svnweb.freebsd.org/changeset/base/332365
> > > 
> > > Log:
> > >   Set zfs_arc_free_target to v_free_target.
> > >   
> > >   Page daemon output is now regulated by a PID controller with a setpoint
> > >   of v_free_target. Moreover, the page daemon now wakes up regularly
> > >   rather than waiting for a wakeup from another thread. This means that
> > >   the free page count is unlikely to drop below the old
> > >   zfs_arc_free_target value, and as a result the ARC was not readily
> > >   freeing pages under memory pressure. Address the immediate problem by
> > >   updating zfs_arc_free_target to match the page daemon's new behaviour.
> > 
> > Can you explain a bit more about the new page daemon algorithm (and
> > about reclaiming free memory from zones)?
> 
> The old algorithm was pretty simple: there was a free page target and
> below that, a wakeup threshold. Any time a thread allocated a page and
> in so doing caused the free page count to drop below the wakeup
> threshold, that thread would wake up the page daemon, which would scan
> the inactive queue and free pages until the free target was reached or
> the end of the inactive queue was reached.
> 
> This is simple and easy to reason about, but has some drawbacks. When
> memory pressure is constant, it leads to bursts of CPU usage and lock
> contention. The static watermarks may also be insufficient for some
> demanding workloads. In particular, the wakeup threshold might be too
> low, thus allowing the free page count to drop to dangerous levels and
> triggering expensive memory shortage handling (i.e., VM_WAIT).
> 
> The new algorithm uses a control loop to dynamically compute a target
> for each scan of the inactive queue. The loop takes as input the
> magnitude of the page shortage (v_free_target - v_free_count) and keeps
> track of the rate of change of this difference (i.e., the rate at which
> free pages are being consumed) and the sum of this difference over time
> (i.e., a cumulative value for the magnitude of recent page shortages).
> These factors are used to compute "shortage", the number of pages to
> reclaim with the goal of maintaining a free page count of v_free_target.
> 
> The effect of the new algorithm is that the page daemon runs more
> frequently but for shorter durations, so its CPU usage is more even. It
> responds dynamically to the demands of the workload, so the shortcomings
> of a pair of static watermarks are gone.
> 
> r329882 doesn't really change anything with respect to reclamation of
> pages from UMA zones. There are some plans to address shortcomings there
> in the near future though.
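
As a concrete illustration, the control loop described in the quoted text
above can be sketched in C roughly as follows. This is only a sketch under
stated assumptions, not the code from r329882: the struct, the function
names, the integer inverse-gain divisors and the clamping of the integral
term are all hypothetical choices made for the example.

```c
#include <assert.h>

/* Hypothetical controller state; field names are illustrative only. */
struct pid_state {
	int setpoint;	/* desired free page count (v_free_target) */
	int integral;	/* running sum of recent errors, clamped */
	int prev_error;	/* error seen on the previous interval */
	int bound;	/* upper clamp for the integral term (assumed) */
	int kp_div;	/* inverse proportional gain (assumed) */
	int ki_div;	/* inverse integral gain (assumed) */
	int kd_div;	/* inverse derivative gain (assumed) */
};

static int
clamp(int v, int lo, int hi)
{
	return (v < lo ? lo : (v > hi ? hi : v));
}

/*
 * Run one control interval and return "shortage", the number of pages
 * to reclaim. The result is never negative.
 */
static int
pid_shortage(struct pid_state *ps, int free_count)
{
	int error, derivative, output;

	assert(ps->kp_div > 0 && ps->ki_div > 0 && ps->kd_div > 0);

	/* Magnitude of the page shortage: v_free_target - v_free_count. */
	error = ps->setpoint - free_count;
	/* Rate of change of the shortage since the last interval. */
	derivative = error - ps->prev_error;
	/* Cumulative shortage over time, kept within [0, bound]. */
	ps->integral = clamp(ps->integral + error, 0, ps->bound);
	ps->prev_error = error;

	output = error / ps->kp_div + ps->integral / ps->ki_div +
	    derivative / ps->kd_div;
	return (output > 0 ? output : 0);
}
```

With a setpoint of 1000 pages and a free count of 900, the sketch asks for
a positive number of pages; once the free count rises above the setpoint it
returns 0, so there is nothing to reclaim in that interval.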

Thanks, that is a very nice explanation.
IMHO, the old zfs_arc_free_target is the better choice for ZFS in this
case: a zfs_arc_free_target that is too close to vm_cnt.v_free_min can
trigger arc_target corrections too often, which costs CPU time and puts
extra load on the memory subsystem.
ZFS needs a more accurate, ZFS-specific pressure signal, and I am trying
to implement one in D7538.

> > PS: ZFS needs some more time to free pages from the ARC. Also, vanilla
> > ZFS has broken logic for counting the ARC's used and free memory. To
> > count system-wide used and free memory most accurately, the free
> > memory held inside zones needs to be accounted for.
> 
> Yes, there are a number of problems in this area predating r329882. This
> commit is really just a bandaid.
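
The problem described in the commit log at the top of this message can be
shown with a toy model. The sketch below assumes, as a simplification, that
the ARC treats memory as short only when the free page count falls below
zfs_arc_free_target; the function name and the page counts are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: pages of headroom the ARC believes it has.
 * A negative result means the ARC should start shrinking.
 */
static int64_t
arc_headroom(int64_t free_count, int64_t arc_free_target)
{
	assert(free_count >= 0 && arc_free_target >= 0);
	return (free_count - arc_free_target);
}
```

If the page daemon holds the free count near a v_free_target of, say, 1000
pages, a threshold of 300 leaves the headroom positive at a free count of
950 and the ARC never shrinks, while a threshold of 1000 makes the headroom
negative at the same free count.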

