Re: The pagedaemon evicts ARC before scanning the inactive page list

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Wed, 19 May 2021 04:17:02 UTC
On Tue, May 18, 2021 at 09:55:25PM -0600, Alan Somers wrote:
> On Tue, May 18, 2021 at 9:25 PM Konstantin Belousov <kostikbel@gmail.com> wrote:
> > Is your machine ZFS-only?  If yes, then the typical sources of inactive
> > memory can be of two kinds:
> >
> 
> No, there is also FUSE.  But there is typically < 1GB of Buf memory, so I
> didn't mention it.
As Mark mentioned, buffers use the page cache as a second-level cache. More
precisely, there is a relatively limited number of buffers in the system;
they are just headers describing a set of pages. When a buffer is
recycled, its pages are put on the inactive queue.

This is why I asked whether your machine is ZFS-only: I/O on
bufcache-using filesystems typically adds to the inactive queue.
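
As an aside, the queue sizes are easy to watch from userland.  A minimal
sketch, assuming nothing beyond the standard vm.stats.vm.* sysctls, that
polls the page-queue counters once a second:

#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Read one of the vm.stats.vm.* page counters. */
static uint32_t
queue_count(const char *name)
{
    uint32_t v = 0;
    size_t len = sizeof(v);

    (void)sysctlbyname(name, &v, &len, NULL, 0);
    return (v);
}

int
main(void)
{
    for (;;) {
        printf("active %u inactive %u laundry %u free %u\n",
            queue_count("vm.stats.vm.v_active_count"),
            queue_count("vm.stats.vm.v_inactive_count"),
            queue_count("vm.stats.vm.v_laundry_count"),
            queue_count("vm.stats.vm.v_free_count"));
        sleep(1);
    }
}

Running it while doing I/O on a bufcache-using filesystem should show the
inactive count grow as buffers are recycled.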

> 
> 
> > - anonymous memory that apps allocate with facilities like malloc(3).
> >   If inactive memory is shrinkable, then it is probably not this, because
> >   dirty pages from anon objects must go through the laundry->swap route
> >   to get evicted, and you did not mention swapping
> >
> 
> No, there's no appreciable amount of swapping going on.  Nor is the laundry
> list typically more than a few hundred MB.
> 
> 
> > - double-copy pages cached in the v_objects of ZFS vnodes, clean or dirty.
> >   If unmapped, these are mostly a waste.  Even if mapped, the source
> >   of truth for the data is the ARC, AFAIU, so they can be dropped as well,
> >   since the inactive state means that their content is not hot.
> >
> 
> So if a process mmap()'s a file on ZFS and reads from it but never writes
> to it, will those pages show up as inactive?
It depends on the workload, and it does not matter much whether the pages
are clean or dirty.  Right after mapping, or under an intense access
pattern, they sit on the active list.  If not touched for long enough, or
after being cycled through the buffer cache for I/O (but ZFS pages do not
go through the buffer cache), they are moved to the inactive list.
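
The residency half of this is observable from userland.  A small sketch,
assuming an existing, non-empty file as the argument; note that mincore(2)
only reports whether a page is resident, not which queue it is on, so the
active/inactive split still has to come from the counters shown earlier:

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    long pgsz = sysconf(_SC_PAGESIZE);
    struct stat st;
    size_t i, npages, resident;
    volatile char sink;
    char *p, *vec;
    int fd;

    if (argc != 2)
        errx(1, "usage: %s file", argv[0]);
    if ((fd = open(argv[1], O_RDONLY)) == -1)
        err(1, "open(%s)", argv[1]);
    if (fstat(fd, &st) == -1)
        err(1, "fstat");
    npages = (st.st_size + pgsz - 1) / pgsz;
    if ((p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0)) ==
        MAP_FAILED)
        err(1, "mmap");

    /* Touch one byte per page so every page is faulted in. */
    for (i = 0; i < npages; i++)
        sink = p[i * pgsz];

    /* mincore(2) fills vec[] with per-page residency flags. */
    if ((vec = malloc(npages)) == NULL)
        err(1, "malloc");
    if (mincore(p, st.st_size, vec) == -1)
        err(1, "mincore");
    for (resident = 0, i = 0; i < npages; i++)
        if (vec[i] & MINCORE_INCORE)
            resident++;
    printf("%zu of %zu pages resident\n", resident, npages);
    return (0);
}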

> 
> 
> >
> > You can try to inspect the most outstanding objects adding to the
> > inactive queue with 'vmobject -o' to see where most of the inactive
> > pages come from.
> >
> 
> Wow, that did it!  About 99% of the inactive pages come from just a few
> vnodes which are used by the FUSE servers.  But I also see a few large
> entries like
> 1105308 333933 771375   1   0 WB  df
> What does that signify?
Those entries are anonymous memory; the 'df' type marks a default object,
which is what backs plain anonymous mappings.
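
For reference, a trivial sketch of the kind of mapping that creates such
an entry: it dirties an anonymous region and then sleeps, so the backing
object can be found in the object list (the 256MB size is arbitrary):

#include <sys/mman.h>
#include <err.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    size_t sz = 256UL * 1024 * 1024;
    char *p;

    p = mmap(NULL, sz, PROT_READ | PROT_WRITE,
        MAP_ANON | MAP_PRIVATE, -1, 0);
    if (p == MAP_FAILED)
        err(1, "mmap");
    memset(p, 0xa5, sz);    /* dirty every page */
    pause();                /* hold the mapping for inspection */
    return (0);
}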

> 
> 
> >
> > If indeed they are double-copy, then perhaps ZFS can react even to the
> > current primitive vm_lowmem signal somewhat differently.  First, it could
> > do a pass over its vnodes and
> > - free clean unmapped pages
> > - if some targets are not met after that, launder dirty pages,
> >   then return to freeing clean unmapped pages
> > all that before ever touching its cache (ARC).
> >
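
To make the proposed shape concrete, here is a kernel-side sketch of such
a vm_lowmem handler.  It only illustrates the two-pass idea, not working
code: zfs_free_clean_unmapped_pages(), zfs_launder_dirty_pages(), and
zfs_reclaim_targets_met() are hypothetical helpers, and the real ZFS
handler (arc_lowmem) does none of this today:

#include <sys/param.h>
#include <sys/eventhandler.h>

static void
zfs_lowmem(void *arg __unused, int flags __unused)
{
    /*
     * Pass 1: drop clean, unmapped double-copy pages from the
     * v_objects of ZFS vnodes.
     */
    zfs_free_clean_unmapped_pages();

    /*
     * Pass 2: if the reclamation targets are still not met, launder
     * dirty pages and retry pass 1.  Only after that would the
     * handler fall back to shrinking the ARC.
     */
    if (!zfs_reclaim_targets_met()) {
        zfs_launder_dirty_pages();
        zfs_free_clean_unmapped_pages();
    }
}
EVENTHANDLER_DEFINE(vm_lowmem, zfs_lowmem, NULL, EVENTHANDLER_PRI_FIRST);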