5.x w/auto-maxusers has insane kern.maxvnodes
Brian Fundakowski Feldman
green at FreeBSD.org
Sat May 8 22:18:34 PDT 2004
Brian Fundakowski Feldman <green at FreeBSD.org> wrote:
> I have a 512MB system and had to adjust kern.maxvnodes (desiredvnodes) down
> to something reasonable after discovering that it was the sole cause of too
> much paging on my workstation. The target number of vnodes was set to
> 33000, which would not be so bad if it did not also leave so many more
> UFS, VM, and VFS objects, and the VM objects' associated inactive cache
> pages, lying around. I ended up saving a good 100MB of memory just by
> adjusting kern.maxvnodes back down to something reasonable. Here are the
> current allocations (and some of the peak values):
>
> ITEM            SIZE   LIMIT    USED    FREE   REQUESTS
> FFS2 dinode:     256,      0,  12340,     95,    1298936
> FFS1 dinode:     128,      0,    315,   3901,    2570969
> FFS inode:       140,      0,  12655,  14589,    3869905
> L VFS Cache:     291,      0,      5,    892,      51835
> S VFS Cache:      68,      0,  13043,  23301,    4076311
> VNODE:           260,      0,  32339,     16,      32339
> VM OBJECT:       132,      0,  10834,  24806,    2681863
> (The number of VM pages allocated specifically to vnodes is not easy to
> determine directly; all I can point to is that I saved that much memory even
> though the objects themselves were never reclaimed after uma_zfree().)
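
For what it's worth, the adjustment itself is trivial; the little userland
sketch below does the equivalent of "sysctl kern.maxvnodes=..." through
sysctlbyname(3). The 20000 is only an example value, not a recommendation.

/*
 * Lower kern.maxvnodes from userland, the same way sysctl(8) would.
 * The new value here is just an illustration.
 */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <err.h>
#include <stdio.h>

int
main(void)
{
	int newval = 20000, oldval;
	size_t oldlen = sizeof(oldval);

	if (sysctlbyname("kern.maxvnodes", &oldval, &oldlen,
	    &newval, sizeof(newval)) == -1)
		err(1, "sysctlbyname");
	printf("kern.maxvnodes: %d -> %d\n", oldval, newval);
	return (0);
}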
>
> We really need to look into making the desiredvnodes default target more
> sane before 5.x is -STABLE, or people are going to be very surprised when
> they switch from 4.x and see paging increase substantially. Another
> surprising thing is how many of these objects cannot be reclaimed because
> they are UMA_ZONE_NOFREE or have no zfree function. If they were
> reclaimable, I'd have an extra 10MB back right now in my specific case,
> having just reduced the kern.maxvnodes setting and done a failed umount on
> every partition to force the vnodes to be flushed.
>
> The vnodes are kept on the free vnode list after being freed because they
> might be used again without having had all of their associated VFS
> information flushed out -- but they should always be in a state where the
> list can be rescanned, so they can actually be reclaimed by UMA if it asks
> for them. The rest of the zones should need very little to support
> uma_reclaim(), so why are they not already set up that way? One last good
> example of waste due to the lack of a zfree function is the PV entries
> backing the page tables on i386:
> PV ENTRY:         28, 938280,  59170, 120590,  199482221
> Once again, why do those actually need to be non-reclaimable?
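
To make the NOFREE point concrete: whether a zone can ever hand pages back
is decided by a flag when the zone is created. The sketch below is purely
illustrative, with made-up zone names, and is not the actual kernel code,
but it shows the shape of the difference:

/*
 * Illustrative only.  A zone created with UMA_ZONE_NOFREE never gives
 * its slabs back, so uma_reclaim() cannot return those pages to the VM
 * system; an ordinary zone can have its cached items drained when the
 * pagedaemon asks for memory.
 */
#include <sys/param.h>
#include <vm/uma.h>

static uma_zone_t example_nofree_zone;	/* hypothetical */
static uma_zone_t example_normal_zone;	/* hypothetical */

static void
example_zones_init(void)
{
	/* Pages backing these items stay pinned for the life of the system. */
	example_nofree_zone = uma_zcreate("example nofree", 256,
	    NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_NOFREE);

	/* Cached items here can be reclaimed under memory pressure. */
	example_normal_zone = uma_zcreate("example normal", 256,
	    NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0);
}

Anything in the first category is invisible to uma_reclaim(), which is
exactly the waste described above.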
Anyone have any ideas yet? Here's what the memory on the system in question
looks like, and what maxusers value gets auto-picked:
vm.vmtotal:
System wide totals computed every five seconds: (values in kilobytes)
===============================================
Processes: (RUNQ: 3 Disk Wait: 3 Page Wait: 0 Sleep: 76)
Virtual Memory: (Total: 1535K, Active 521056K)
Real Memory: (Total: 392912K Active 230436K)
Shared Virtual Memory: (Total: 122836K Active: 106764K)
Shared Real Memory: (Total: 19880K Active: 16032K)
Free Memory Pages: 126128K
kern.maxusers: 251
Here's where the relevant things get defined:
subr_param.c:#define NPROC (20 + 16 * maxusers)
subr_param.c: maxproc = NPROC;
desiredvnodes = min(maxproc + cnt.v_page_count / 4, 2 * vm_kmem_size /
(5 * (sizeof(struct vm_object) + sizeof(struct vnode))));
minvnodes = desiredvnodes / 4;
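Plugging this box's numbers into those formulas gives roughly the observed
target. Here is a rough userland sketch; the page count and vm_kmem_size
values are assumptions for 512MB of RAM on i386 (the real numbers are a bit
lower because the kernel's own pages are excluded), and the structure sizes
come from the SIZE column of the zone listing above:

/*
 * Back-of-the-envelope check, not kernel code.  Constants marked
 * "assumed" are rough guesses for a 512MB i386 box.
 */
#include <stdio.h>

int
main(void)
{
	long maxusers = 251;			/* auto-picked, per above */
	long maxproc = 20 + 16 * maxusers;	/* NPROC */
	long v_page_count = 512L * 1024 * 1024 / 4096;	/* assumed ~131072 */
	long vm_kmem_size = 200L * 1024 * 1024;		/* assumed ~200MB */
	long obj_sz = 132, vnode_sz = 260;	/* VM OBJECT, VNODE sizes */
	long cap, desired;

	cap = 2 * vm_kmem_size / (5 * (obj_sz + vnode_sz));
	desired = maxproc + v_page_count / 4;
	if (cap < desired)
		desired = cap;
	printf("maxproc = %ld, desiredvnodes = %ld\n", maxproc, desired);
	return (0);
}

That prints maxproc = 4036 and desiredvnodes = 36804; the running kernel
lands a bit lower (the ~33000 target above) because its real page count is
smaller, but either way the kmem-based cap never kicks in at this memory
size, so it is the cnt.v_page_count / 4 term that drives the number up.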
It really doesn't seem appropriate to _ever_ scale maxvnodes (desiredvnodes)
up that high just because I have 512MB of RAM.
--
Brian Fundakowski Feldman \'[ FreeBSD ]''''''''''\
<> green at FreeBSD.org \ The Power to Serve! \
Opinions expressed are my own. \,,,,,,,,,,,,,,,,,,,,,,\