ZFS arc_reclaim_needed: better cooperation with pagedaemon

Artem Belevich fbsdlist at src.cx
Mon Aug 23 00:14:26 UTC 2010

Do you by any chance have a graph showing kstat.zfs.misc.arcstats.size
behavior in addition to the stuff included on your graphs now?  All I
can tell from your graphs is that v_free_count+v_cache_count shifted a
bit lower relative to v_free_target+v_cache_min. It would be
interesting to see what effect your patch has on ARC itself,
especially when ARC starts giving up memory and when it stops.


On Sun, Aug 22, 2010 at 2:46 PM, Andriy Gapon <avg at freebsd.org> wrote:
> I propose that the following code in arc_reclaim_needed
> (sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c)
> /*
>  * If pages are needed or we're within 2048 pages
>  * of needing to page need to reclaim
>  */
> if (vm_pages_needed || (vm_paging_target() > -2048))
> be changed to
> if (vm_paging_needed())
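
A rough userspace model may help compare the two checks. This is only a
sketch: the struct vm_model type, the model_* and arc_reclaim_check_* names
are made up for illustration, and the two threshold formulas are taken from
the graph legend later in this message (vm_paging_target() == 0 at
v_free_target+v_cache_min, vm_paging_needed() threshold at
v_free_reserved+v_cache_min) rather than copied from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel page counters; all values in pages. */
struct vm_model {
    int  v_free_target;
    int  v_free_reserved;
    int  v_cache_min;
    int  v_free_count;
    int  v_cache_count;
    bool vm_pages_needed;   /* flag used to wake up pagedaemon */
};

/* Positive when free+cache is below v_free_target+v_cache_min. */
static int model_paging_target(const struct vm_model *m)
{
    assert(m != NULL);
    return (m->v_free_target + m->v_cache_min) -
           (m->v_free_count + m->v_cache_count);
}

/* True when free+cache has dropped below v_free_reserved+v_cache_min. */
static bool model_paging_needed(const struct vm_model *m)
{
    assert(m != NULL);
    return (m->v_free_reserved + m->v_cache_min) >
           (m->v_free_count + m->v_cache_count);
}

/* Check currently used in arc_reclaim_needed, as quoted above. */
static bool arc_reclaim_check_old(const struct vm_model *m)
{
    return m->vm_pages_needed || model_paging_target(m) > -2048;
}

/* Proposed replacement check. */
static bool arc_reclaim_check_new(const struct vm_model *m)
{
    return model_paging_needed(m);
}
```

With invented values v_free_target = 20000, v_free_reserved = 1500 and
v_cache_min = 20000, a system sitting at free+cache = 41000 pages already
satisfies the old check, while the new check stays false until free+cache
drops below 21500.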
> Rationale.
> 1. Why not current checks.
> ARC sizing should cooperate with pagedaemon in freeing pages.
> If ARC starts shrinking "prematurely", before pagedaemon is woken up, then no
> potentially eligible inactive pages will be recycled and no potentially eligible
> active pages will be deactivated (subject to v_inactive_target).
> This would lead to ARC size going to its minimum value (which could hurt ZFS
> performance).  Only after that is there a chance that pagedaemon would be woken
> up to do its cleaning.
> And conversely, if ARC doesn't shrink in time, then pagedaemon would have to
> recycle pages with data that could be needed again soon and that would lead to
> excessive swapping and disk I/O.
> vm_paging_target() is used only by pagedaemon internally; it effectively sets an
> _upper_ limit on how many pages pagedaemon would free when it's activated.
> It is not an indication of whether pagedaemon should be scanning/freeing pages.
> Thus the check of vm_paging_target() leads to premature ARC shrinking.
> I believe that many people observe this behavior on sufficiently active systems
> (not dedicated file servers) with a few GB of RAM (1-8).
> The vm_pages_needed check is redundant, because this is the flag that is used to
> wake up pagedaemon.  So when it is set, vm_paging_needed() is true and
> vm_paging_target() is "way" above zero.  And this flag is reset to zero when
> vm_page_count_min() becomes false, which corresponds to even fewer free pages
> than when vm_paging_needed() is true.
> 2. Why the new check.
> vm_paging_needed() is the (earliest) condition that wakes up pagedaemon (see
> vm_page_alloc).  pagedaemon would first of all fire the vm_lowmem event, for
> which ARC already has a handler and which would cause ARC size to shrink.
> It would seem that having the vm_paging_needed() check is redundant then.
> Almost: if memory pressure is significant, then vm_paging_needed() may stay
> true for a while and that would cause additional ARC reduction by
> arc_reclaim_thread.
> Final notes.
> I think that
> vm_paging_target() > -2048
> check was modeled after the check in the original OpenSolaris code:
> freemem < lotsfree + needfree + extra
> The issue is that in my understanding OpenSolaris pagedaemon works differently
> from FreeBSD pagedaemon.
> OpenSolaris pagedaemon is activated when freemem (equivalent of our free +
> cache) falls down to a certain higher mark (lotsfree).  Initially it scans pages
> at a slow rate.  If freemem falls further, the rate linearly increases until it
> reaches its maximum when freemem goes to or below a certain lower mark.
> Our pagedaemon is activated when free + cache falls down to a value at which
> vm_paging_needed() becomes true (see definition of this function).  When it is
> activated it makes a scan pass through inactive and active pages, setting a
> certain target for free+cache, but that target is "soft" and actually is an
> upper limit on how many pages could be freed during the pass. pagedaemon would
> make a second (or subsequent) pass only if free+cache falls to a value that is
> even lower than the threshold in vm_paging_needed(), which means significant
> (severe even) memory pressure/shortage.
> So on a sufficiently active system free+cache would typically oscillate between
> v_free_reserved+v_cache_min at the bottom and some semi-random values "near"
> v_free_target+v_cache_min at the tops.  That is with ARC excluded from the picture.
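
The contrast between the two pagedaemons described above can be sketched as
two small functions. This is a simplified model under stated assumptions, not
either kernel's code: the function names and the low_mark, slow_rate and
fast_rate parameters are hypothetical, and the ramp is a plain linear
interpolation between lotsfree and the lower mark.

```c
#include <assert.h>

/*
 * OpenSolaris-style model (hypothetical): the scan rate is 0 while freemem
 * is above lotsfree, starts at slow_rate once freemem falls to lotsfree, and
 * rises linearly to fast_rate as freemem falls to low_mark or below.
 */
static int solaris_style_scan_rate(int freemem, int lotsfree, int low_mark,
                                   int slow_rate, int fast_rate)
{
    assert(lotsfree > low_mark && fast_rate >= slow_rate);

    if (freemem > lotsfree)
        return 0;                       /* pagedaemon not active */
    if (freemem <= low_mark)
        return fast_rate;               /* maximum scan rate */

    /* Linear interpolation between the two marks. */
    return slow_rate + (int)((long)(fast_rate - slow_rate) *
                             (lotsfree - freemem) / (lotsfree - low_mark));
}

/*
 * FreeBSD-style model (hypothetical): a single threshold decides whether
 * pagedaemon is woken up at all; there is no gradual ramp.
 */
static int freebsd_style_wakeup(int free_plus_cache, int v_free_reserved,
                                int v_cache_min)
{
    return (v_free_reserved + v_cache_min) > free_plus_cache;
}
```

In the first model a consumer such as ARC sees a graded signal between the
two marks, which is what a "freemem < lotsfree + needfree + extra" style
check relies on. In the second there is only one edge, at
v_free_reserved+v_cache_min, so a check placed well above that edge fires
long before pagedaemon does anything.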
> And about pictures :-)
> Behavior of free+cache with current arc_reclaim_needed code:
> http://people.freebsd.org/~avg/avail-mem-before.png
> and its behavior after the patch:
> http://people.freebsd.org/~avg/avail-mem-after.png
> The legends on the pictures are incorrect, sorry, my mastery of drraw is not
> good yet.
> Correct legends:
> "aqua" color - v_free_target+v_cache_min (vm_paging_target() == 0)
> "fuchsia" color - v_free_reserved+v_cache_min (vm_paging_needed() threshold)
> "lime" color - v_free_count+v_cache_count indeed :)
> Y axis - % of total page count.
> I think the graphs speak for themselves.
> --
> Andriy Gapon