svn commit: r343030 - in head/sys: cam conf dev/md dev/nvme fs/fuse fs/nfsclient fs/smbfs kern sys ufs/ffs vm

Mark Johnston markj at freebsd.org
Thu Feb 14 16:12:00 UTC 2019


On Thu, Feb 14, 2019 at 06:56:42PM +1100, Bruce Evans wrote:
> On Wed, 13 Feb 2019, Justin Hibbits wrote:
> 
> > On Tue, 15 Jan 2019 01:02:17 +0000 (UTC)
> > Gleb Smirnoff <glebius at FreeBSD.org> wrote:
> >
> >> Author: glebius
> >> Date: Tue Jan 15 01:02:16 2019
> >> New Revision: 343030
> >> URL: https://svnweb.freebsd.org/changeset/base/343030
> >>
> >> Log:
> >>   Allocate pager bufs from UMA instead of 80-ish mutex protected
> >> linked list.
> > ...
> >
> > This seems to break 32-bit platforms, or at least 32-bit book-e
> > powerpc, which has a limited KVA space (~500MB).  I've seen it
> > preallocate over 2500 pbufs, at 128kB each, eating up over 300MB of
> > KVA and leaving very little for the rest of runtime.
> 
> Hrmph.  I complained about other things in this commit when it was
> committed, but not about this largest bug: preallocation was broken
> then, so I thought that it wasn't done and that the problems would be
> smaller unless the excessive limits were actually reached.
> 
> Now i386 does it:
> 
> XX ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
> XX 
> XX swrbuf:                 336,    128,       0,       0,       0,   0,   0
> XX swwbuf:                 336,     64,       0,       0,       0,   0,   0
> XX nfspbuf:                336,    128,       0,       0,       0,   0,   0
> XX mdpbuf:                 336,     25,       0,       0,       0,   0,   0
> XX clpbuf:                 336,    128,       0,       5,       4,   0,   0
> XX vnpbuf:                 336,   2048,       0,       0,       0,   0,   0
> XX pbuf:                   336,     16,       0,    2535,       0,   0,   0
> 
> but i386 now has 4GB of KVA, with almost 3GB to waste, so the bug is not
> noticed there.
> 
> The preallocation wasn't there in my last mail to the author about nearby
> bugs, on 24 Jan 2019:
> 
> YY vnpbuf:                 568,   2048,       0,       0,       0,   0,   0
> YY clpbuf:                 568,    128,       0,     128,    8750,   0,   1
> YY pbuf:                   568,     16,       0,       4,       0,   0,   0
> 
> This output is on amd64 where the SIZE is larger and everything else was
> the same as on i386.  Now amd64 shows the large preallocation too.
> 
> There seems to be another bug: the especially small LIMIT of 16
> turns into a preallocation of 2535 without causing an immediate
> reduction to the limit.
> 
> I happen to have kernels from 24 and 25 Jan handy.  The first one is
> amd64 r343346M built on Jan 23, and it doesn't do the large
> preallocation.  The second one is i386 r343388:343418M built on Jan
> 25, and it does the large preallocation.  Both call uma_prealloc() to
> ask for nswbuf_max = 0x9e9 buffers, but the old version only allocates
> 4 buffers while the later version allocates 0x9e9 buffers.
> 
> The only relevant commit between the good and bad versions seems to be
> r343453.  This fixes uma_prealloc() to actually work.  But it is a feature
> for it to not work when its caller asks for too much.

I guess you meant r343353.  In any case, the pbuf keg is _NOFREE, so
even without preallocation the large pbuf zone limits may become
problematic if there are bursts of allocation requests.

> 0x9e9 is the sum of the LIMITs of all pbuf pools.  The main bug in
> r343030 is that it expands nswbuf, which is supposed to give the
> combined limit, from its normal value of 256 to 0x9e9.  (r343030
> actually used nswbuf before it was properly initialized, so used its
> maximum value of 256 even on small systems with nswbuf = 16.  Only
> this has been fixed.)
> 
> On i386, nbuf is excessively limited so as to give a maxbufspace of
> about 100MB so as to fit in 1GB of kva even with infinite RAM and
> -current's actual 4GB of kva.  nbuf is correctly limited to give a
> much smaller maxbufspace when RAM is small (kva scaling for this is
> not done so well).  nswbuf is restricted if nbuf is restricted, but
> not enough (except in my version).  It is normally 256, so the pbuf
> allocation used to be 32MB, and this is already a bit large compared
> with 100MB for maxbufspace.  Expanding pbufs by a factor of 0x9e9/0x100
> gives the silly combination of 100MB for maxbufspace and 317MB for
> pbufs.
> 
> If kva is only 512MB instead of 1GB, then maxbufspace should be only
> 50MB and nswbuf should be smaller too.  Similarly for PAE on i386 back
> when it was configured with 1GB kva by default.  Only about 512MB are
> left after allocating space for page table metadata.  I have fixes
> that scale most of this better.  Large subsystems starting with kmem
> get a hard-coded fraction of the usable kva.  E.g., kmem gets about
> 60% of usable kva instead of about 40% of nominal kva.  Most other
> large subsystems including the buffer cache get about 1/8 of the
> remaining 40% of usable kva.  Scaling for other subsystems is mostly
> worse than for kmem.  pbufs are part of the buffer cache allocation.
> The expansion factor of 0x9e9/0x100 breaks this.
> 
> I don't understand how pbuf_preallocate() allocates for the other
> pbuf pools.  When I debugged this for clpbufs, the preallocation was
> not used.  pbuf types other than clpbufs seem to be unused in my
> configurations.  I thought that pbufs were used during initialization,
> since they end up with a nonzero FREE count, but their only use seems
> to be to preallocate them.

All of the pbuf zones share a common slab allocator.  The zones have
individual limits but can tap into the shared preallocation.
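
The figures quoted above can be cross-checked with a little arithmetic.
The sketch below is only an illustration: it takes the per-zone LIMIT
values from the i386 statistics quoted earlier and the 128kB-per-pbuf
figure from Justin's report, and assumes nothing else about the kernel.
The names PBUF_ZONE_LIMITS, PBUF_KVA_BYTES and pbuf_kva_mb are made up
for this example and do not exist in the tree.

```python
# Per-zone LIMIT values copied from the i386 zone statistics quoted above.
PBUF_ZONE_LIMITS = {
    "swrbuf": 128,
    "swwbuf": 64,
    "nfspbuf": 128,
    "mdpbuf": 25,
    "clpbuf": 128,
    "vnpbuf": 2048,
    "pbuf": 16,
}

# KVA consumed per pbuf, using the 128kB figure reported in the thread.
PBUF_KVA_BYTES = 128 * 1024
MB = 1024 * 1024


def pbuf_kva_mb(count):
    """Return the KVA in MB consumed by `count` preallocated pbufs."""
    return count * PBUF_KVA_BYTES / MB


# Sum of all per-zone limits; the thread gives this as nswbuf_max = 0x9e9.
total_limit = sum(PBUF_ZONE_LIMITS.values())

if __name__ == "__main__":
    print(f"sum of limits: {total_limit} ({total_limit:#x})")
    print(f"KVA at nswbuf = 256: {pbuf_kva_mb(256):.0f} MB")
    print(f"KVA at sum of limits: {pbuf_kva_mb(total_limit):.0f} MB")
    print(f"KVA for 2535 free pbufs: {pbuf_kva_mb(2535):.0f} MB")
```

The sum of the limits comes out to 0x9e9, and the two KVA totals agree
with the 32MB and 317MB figures mentioned above.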

