amd64: change VM_KMEM_SIZE_SCALE to 1?

Alan Cox alc at cs.rice.edu
Sat Jul 31 21:39:50 UTC 2010


John Baldwin wrote:
> On Friday, July 30, 2010 2:49:59 pm Alan Cox wrote:
>   
>> John Baldwin wrote:
>>     
>>> I have a strawman of that (relative to 7).  It simply adjusts the hardcoded 
>>> maximum to instead be a function of the amount of physical memory.
>>>
>>>   
>>>       
>> Unless I'm misreading this patch, it would allow "desiredvnodes" to grow 
>> (slowly) on i386/PAE starting at 5GB of RAM until we reach the (too 
>> high) "virt" limit of about 329,000.  Yes?  For example, an 8GB i386/PAE 
>> machine would have 60% more vnodes than was allowed by MAXVNODES_MAX, and 
>> it would not stop there.  I think that we should be concerned about 
>> that, because MAXVNODES_MAX came about because the "virt" limit wasn't 
>> working.
>>     
>
> Agreed.
>
>   
>> As the numbers above show, we could more than halve the growth rate for 
>> "virt" and it would have no effect on either amd64 or i386 machines with 
>> up to 1.5GB of RAM.  They would have just as many vnodes.  Then, with 
>> that slower growth rate, we could simply eliminate MAXVNODES_MAX (or at 
>> least configure it to some absurdly large value), thereby relieving the 
>> fixed cap on amd64, where it isn't needed.
>>
>> With that in mind, the following patch slows the growth of "virt" from 
>> 2/5 of vm_kmem_size to 1/7.  This has no effect on amd64.  However, on 
>> i386, it allows desiredvnodes to grow slowly for machines with 1.5GB to 
>> about 2.5GB of RAM, ultimately exceeding the old desiredvnodes cap by 
>> about 17%.  Once we exceed the old cap, we increase desiredvnodes at a 
>> marginal rate that is almost the same as your patch, about 1% of 
>> physical memory.  It's just computed differently.
>>
>> Using 1/8 instead of 1/7, amd64 machines with less than about 1.5GB lose 
>> about 7% of their vnodes, but they catch up and pass the old limit by 
>> 1.625GB.  Perhaps more importantly, i386 machines only exceed the old 
>> cap by 3%.
>>
>> Thoughts?
>>     
>
> I think this is much better.  My strawman was rather hackish in that it was
> layering a hack on top of the existing calculations.  I prefer your approach.
> I do not think penalizing amd64 machines with less than 1.5GB is a big worry
> as most x86 machines with a small amount of memory are probably running as
> i386 anyway.  Given that, I would probably lean towards 1/8 instead of 1/7,
> but I would be happy with either one.
>
>   

I've looked a bit at an i386/PAE system with 8GB.  I don't think that a 
default configuration, e.g., no changes to the mbuf limits, is at risk 
with 1/7.
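
To put numbers on the 2/5 versus 1/7 comparison, here is a quick
userland sketch of the "virt" arithmetic.  The 320MB vm_kmem_size cap
and the 408-byte combined size of struct vnode and struct vm_object
are assumed i386-era values, so treat the results as illustrative:

#include <stdio.h>

int
main(void)
{
        /* Assumed i386 vm_kmem_size cap and combined struct size. */
        unsigned long long kmem = 320ULL * 1024 * 1024;
        unsigned long long objsz = 408;

        printf("old virt limit (2/5): %llu\n", 2 * kmem / (5 * objsz));
        printf("new virt limit (1/7): %llu\n", kmem / (7 * objsz));
        return (0);
}

With those inputs, the 2/5 rule reproduces the "virt" limit of about
329,000 mentioned above, and the 1/7 rule lowers it to about 117,000,
roughly 17% above the old MAXVNODES_MAX cap of 100,000.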

>> Index: kern/vfs_subr.c
>> ===================================================================
>> --- kern/vfs_subr.c     (revision 210504)
>> +++ kern/vfs_subr.c     (working copy)
>> @@ -284,21 +284,29 @@ SYSCTL_INT(_debug, OID_AUTO, vnlru_nowhere, CTLFLA
>>   * Initialize the vnode management data structures.
>>   */
>>  #ifndef        MAXVNODES_MAX
>> -#define        MAXVNODES_MAX   100000
>> +#define        MAXVNODES_MAX   8388608 /* Reevaluate when physmem exceeds 512GB. */
>>  #endif
>>     
>
> How is this value computed?  I would prefer something like:
>
> '512 * 1024 * 1024 * 1024 / (sizeof(struct vnode) + sizeof(struct vm_object)) / N'
>
> if that is how it is computed.  A brief note about the magic number 393216
> would also be nice to have (and if it could be a constant computed with a
> similar formula, that would be nice, too).
>
>   

I've tried to explain this computation below.
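
In short, with the usual 4KB PAGE_SIZE the new constant works out to
512 * (1024 * 1024 * 1024 / 4096 / 16) = 512 * 16384 = 8,388,608, i.e.,
one vnode for every sixteen 4KB pages of a 512GB machine, which is
where the 8388608 above came from.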

Index: kern/vfs_subr.c
===================================================================
--- kern/vfs_subr.c     (revision 210702)
+++ kern/vfs_subr.c     (working copy)
@@ -282,23 +282,34 @@ SYSCTL_INT(_debug, OID_AUTO, vnlru_nowhere, CTLFLA
 
 /*
  * Initialize the vnode management data structures.
+ *
+ * Reevaluate the following cap on the number of vnodes after the physical
+ * memory size exceeds 512GB.  In the limit, as the physical memory size
+ * grows, the ratio of physical pages to vnodes approaches sixteen to one.
  */
 #ifndef        MAXVNODES_MAX
-#define        MAXVNODES_MAX   100000
+#define        MAXVNODES_MAX   (512 * (1024 * 1024 * 1024 / PAGE_SIZE / 16))
 #endif
 static void
 vntblinit(void *dummy __unused)
 {
+       int physvnodes, virtvnodes;
 
        /*
-        * Desiredvnodes is a function of the physical memory size and
-        * the kernel's heap size.  Specifically, desiredvnodes scales
-        * in proportion to the physical memory size until two fifths
-        * of the kernel's heap size is consumed by vnodes and vm
-        * objects.
+        * Desiredvnodes is a function of the physical memory size and the
+        * kernel's heap size.  Generally speaking, it scales with the
+        * physical memory size.  The ratio of desiredvnodes to physical pages
+        * is one to four until desiredvnodes exceeds 98,304.  Thereafter, the
+        * marginal ratio of desiredvnodes to physical pages is one to
+        * sixteen.  However, desiredvnodes is limited by the kernel's heap
+        * size.  The memory required by desiredvnodes vnodes and vm objects
+        * may not exceed one seventh of the kernel's heap size.
         */
-       desiredvnodes = min(maxproc + cnt.v_page_count / 4, 2 * vm_kmem_size /
-           (5 * (sizeof(struct vm_object) + sizeof(struct vnode))));
+       physvnodes = maxproc + cnt.v_page_count / 16 + 3 * min(98304 * 4,
+           cnt.v_page_count) / 16;
+       virtvnodes = vm_kmem_size / (7 * (sizeof(struct vm_object) +
+           sizeof(struct vnode)));
+       desiredvnodes = min(physvnodes, virtvnodes);
        if (desiredvnodes > MAXVNODES_MAX) {
                if (bootverbose)
                        printf("Reducing kern.maxvnodes %d -> %d\n",


