svn commit: r351673 - in head: lib/libmemstat share/man/man9 sys/cddl/compat/opensolaris/kern sys/kern sys/vm

Mark Johnston markj at freebsd.org
Sat Sep 7 14:32:48 UTC 2019


On Wed, Sep 04, 2019 at 09:00:03AM +0300, Andriy Gapon wrote:
> On 04/09/2019 01:01, Mark Johnston wrote:
> > Slawa and I talked about this in the past.  His complaint is that a
> > large cache can take a significant amount of time to trim, and it
> > manifests as a spike of CPU usage and contention on the zone lock.  In
> > particular, keg_drain() iterates over the list of free slabs with the
> > keg lock held, and if many items were freed to the keg while
> > trimming/draining, the list can be quite long.  This can have effects
> > outside the zone, for example if we are reclaiming items from zones used
> > by other UMA zones, like the bucket or slab zones.
> 
> My concern is different, though.
> I feel that having oversized caches for long periods of time produces a skewed
> picture of memory usage.  In particular, some ZFS caches are sometimes
> extremely oversized.  I don't care much about the details of the consequences
> of such oversized caches; I just think that it is not right on a more general
> level.
> 
> > Reclaiming cached items when there is no demand for free pages seems
> > wrong to me.
> 
> It certainly was wrong before.
> Now that we have the capability to trim a cache to its working-set size, it
> doesn't feel as wrong to me.

One part of the problem is that some UMA items are expensive to allocate
and free.  On amd64, slabs larger than the page size must be mapped into
KVA, and this is not a scalable operation: pages must be inserted into
and removed from kernel_object, and when they are removed we must issue
a TLB shootdown to all CPUs.  Proactively freeing such items from the
cache might also exacerbate fragmentation over time.
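
To make the asymmetry concrete, here is a rough sketch.  This is not
the actual uma_core.c code: the "direct_mapped" flag stands in for the
keg's flag checks, and only kmem_free(), vm_page_free(),
PHYS_TO_VM_PAGE() and DMAP_TO_PHYS() are real KPIs.

static void
slab_pages_free_sketch(vm_offset_t addr, vm_size_t size, bool direct_mapped)
{
        vm_offset_t off;

        if (direct_mapped) {
                /*
                 * amd64 direct map: translate each address back to its
                 * vm_page and free it.  No kernel_object update, no
                 * KVA teardown, no TLB shootdown.
                 */
                for (off = 0; off < size; off += PAGE_SIZE)
                        vm_page_free(
                            PHYS_TO_VM_PAGE(DMAP_TO_PHYS(addr + off)));
        } else {
                /*
                 * KVA-backed slab: kmem_free() removes the pages from
                 * kernel_object and tears down the mapping, which on
                 * amd64 ends in a TLB shootdown IPI to every CPU.
                 */
                kmem_free(addr, size);
        }
}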

For direct-mapped items I think the tradeoff makes more sense, and we
could indeed start regularly freeing items based on the current
working-set size (WSS) estimate.  There has been a lot of work in the
past year or two to make the page allocator cheaper and more scalable.
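
The estimator itself is cheap: it just tracks the high and low
watermarks of a zone's cached item count over an interval and calls the
difference the working set.  Roughly, as a userspace toy model (all of
the names and numbers below are made up for illustration):

#include <stddef.h>
#include <stdio.h>

struct cache {
        size_t nitems;          /* items currently cached */
        size_t imin;            /* low watermark this interval */
        size_t imax;            /* high watermark this interval */
};

static void
cache_update(struct cache *c, size_t nitems)
{
        c->nitems = nitems;
        if (nitems < c->imin)
                c->imin = nitems;
        if (nitems > c->imax)
                c->imax = nitems;
}

/* Called once per interval, e.g., every 20 seconds. */
static size_t
cache_trim_target(struct cache *c)
{
        size_t wss = c->imax - c->imin;

        /* Reset the watermarks for the next interval. */
        c->imin = c->imax = c->nitems;
        return (wss);           /* keep this many items, free the rest */
}

int
main(void)
{
        struct cache c = { .nitems = 1000, .imin = 1000, .imax = 1000 };

        /* A workload that only ever dips 200 items into the cache. */
        cache_update(&c, 800);
        cache_update(&c, 1000);
        printf("keep %zu of %zu cached items\n",
            cache_trim_target(&c), c.nitems);
        return (0);
}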

> > We historically had similar problems with the page daemon,
> > which last year was changed to perform smaller reclamations at a greater
> > frequency.  I suspect a better approach for UMA would be to similarly
> > increase reclaim frequency and reduce the number of items freed in one
> > go.
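
Concretely, I am thinking of something like the following toy model
(userspace C, with a pthread mutex standing in for the keg lock and
made-up names): detach a bounded batch of items with the lock held, do
the expensive frees after dropping it, and run the pass more
frequently.

#include <pthread.h>
#include <stdlib.h>

struct item {
        struct item *next;
};

static pthread_mutex_t keg_lock = PTHREAD_MUTEX_INITIALIZER;
static struct item *freelist;

static void
keg_trim_batch(int maxfree)
{
        struct item *batch = NULL, *it;

        /* Detach at most maxfree items; the lock hold time is bounded. */
        pthread_mutex_lock(&keg_lock);
        while (maxfree-- > 0 && freelist != NULL) {
                it = freelist;
                freelist = it->next;
                it->next = batch;
                batch = it;
        }
        pthread_mutex_unlock(&keg_lock);

        /* The expensive part (e.g., TLB shootdowns) happens unlocked. */
        while (batch != NULL) {
                it = batch->next;
                free(batch);
                batch = it;
        }
}

int
main(void)
{
        struct item *it;
        int i;

        for (i = 0; i < 100; i++) {
                it = malloc(sizeof(*it));
                it->next = freelist;
                freelist = it;
        }
        /* Two small passes instead of one long drain. */
        keg_trim_batch(50);
        keg_trim_batch(50);
        return (0);
}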

