small patch for pageout. Comments?
Larry McVoy
lm at mcvoy.com
Thu Nov 30 21:09:20 UTC 2017
On Thu, Nov 30, 2017 at 04:01:31PM -0500, Mark Johnston wrote:
> On Thu, Nov 30, 2017 at 12:50:41PM -0800, Larry McVoy wrote:
> > On Thu, Nov 30, 2017 at 03:47:50PM -0500, Mark Johnston wrote:
> > > > I dunno if there is a "right amount". I could make it a little smarter by
> > > > keeping track of how many pages we freed and sleep if we freed none in a
> > > > scan (which seems really unlikely).
> > >
> > > This situation can happen if the inactive queue is full of dirty pages.
> > > A problem with your patch is that we might not give enough time to the
> > > laundry thread (the thread responsible for writing the contents of dirty
> > > pages to disk and returning them to inactive queue for the page daemon
> > > to free) to write out dirty pages. In this case we might trigger the OOM
> > > killer prematurely, and in fact this scenario is what motivated r300865.
> > > So I would argue that we do in fact need to sleep if the page daemon is
> > > failing to make progress, in order to give time for I/O to complete.
> >
> > OK, that sounds reasonable. So what defines progress? v_dfree not
> > increasing? Is one page freed progress?
>
> One page freed is progress. We currently invoke the OOM killer only when
> the page daemon is making no progress. This turns out to be too
> conservative, which is what kib's patch attempts to address. wrt
> your patch, I'm saying that I think we should still sleep after a scan
> that failed to free any pages.
Something like this?
--- a/sys/vm/vm_pageout.c
+++ b/sys/vm/vm_pageout.c
@@ -1752,6 +1752,7 @@ vm_pageout_worker(void *arg)
struct vm_domain *domain;
int domidx, pass;
bool target_met;
+ u_int dfree;
domidx = (uintptr_t)arg;
domain = &vm_dom[domidx];
@@ -1776,6 +1777,7 @@ vm_pageout_worker(void *arg)
*/
while (TRUE) {
mtx_lock(&vm_page_queue_free_mtx);
+ dfree = VM_CNT_FETCH(v_dfree);
/*
* Generally, after a level >= 1 scan, if there are enough
@@ -1815,10 +1817,22 @@ vm_pageout_worker(void *arg)
* (page reclamation) scan, then increase the level
* and scan again now. Otherwise, sleep a bit and
* try again later.
+ *
+ * If we have more than one CPU this pause is not
+ * helpful, it just decreases the rate at which we
+ * clean pages. On a uniprocessor we want to pause
+ * to let the user level processes get some time to
+ * run. We also don't keep banging on the page tables
+ * if we didn't manage to free any in the last pass.
*/
mtx_unlock(&vm_page_queue_free_mtx);
- if (pass >= 1)
- pause("psleep", hz / VM_INACT_SCAN_RATE);
+ if (pass >= 1) {
+ dfree = VM_CNT_FETCH(v_dfree) - dfree;
+ if ((dfree == 0) || (mp_ncpus < 2)) {
+if (!dfree) printf("Sleeping because pass %d didn't find anything\n", pass);
+ pause("psleep", hz / VM_INACT_SCAN_RATE);
+ }
+ }
pass++;
} else {
/*
More information about the freebsd-arch
mailing list