svn commit: r341103 - head/sys/powerpc/include

Bruce Evans brde at optusnet.com.au
Mon Dec 3 20:58:18 UTC 2018


On Mon, 3 Dec 2018, Ian Lepore wrote:

> On Tue, 2018-12-04 at 05:56 +1100, Bruce Evans wrote:
>> On Mon, 3 Dec 2018, Justin Hibbits wrote:
>>
>> [...]
>>
>> Please look at removing VM_KMEM_SIZE_SCALE completely.  I'm now trying to
>> convince kib that it is bogus for all arches, but only know exactly what
>> happens on x86.
> ...
> I know we had problems with the default scaling on armv7 at $work when
> we tried to embed a large (150mb) mdrootfs into our kernel for a system
> with 2gb ram. I had to chase down the meaning of the scale variable
> (and I certainly could have misunderstood it to any degree), but here's
> what I wrote about it after fiddling and finding a value that worked
> for us. This was for early incarnations of 11-stable.

i386 can now fit a 2GB malloc-backed disk in 2.7GB of RAM and its 4GB kva.
This is sort of the opposite packing (allocate the md disk later).  This
requires a few more tweaks:
- change the scale to 1.  The bogus scale of 3 restricts kmem to 2.7GB/3
   = 900MB
- change vm.kmem_size to a few hundred MB above the desired md disk size.
   vm_kmem_size defaults to 1.7GB (was 420MB with 1GB kva).  This leaves
   almost 2.3GB for non-kmem allocations, but only 300-400MB is needed.  I
   change some of these allocations, but the more interesting ones are
   in kmem.

> # Tuning required to make the kernel work with a large
> # embedded filesystem...
> # Allocate one page of kmem_arena KVA for every
> # VM_KMEM_SIZE_SCALE pages of ram.  The default scale is 3,
> # and with a huge (>100MB) embedded mdroot that doesn't leave
> # enough virtual address space to allocate enough kernel
> # stacks, mbufs, and other resources that come out of KVA.
> options 	VM_KMEM_SIZE_SCALE=5

You should probably use vm.kmem_size for this (VM_KMEM_SIZE_SCALE=5 can be
done using a tunable too).  But this is a hack.  The large md allocation
shows that some of the resource calculations are wrong.  kmem uses
vm_cnt.v_page_count (pages) for the main resource size in most cases.
The page count is variable and has already been reduced, but the kva limits
are constants.  150MB is not very large compared with the memory size, so
the reduction in the default kmem size based on page count is not large.
The default kmem size might be larger than VM_KMEM_SIZE_MAX, as on i386 with
1GB kva.  Then the default is actually VM_KMEM_SIZE_MAX, but this is far too
large after the md disk steals 150MB of kva.
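
The scaling and clamping described above can be sketched roughly as
follows.  This is a simplified model for illustration only, not the
kernel's code: the function name, its parameters and the 4K page size
are assumptions, and the real calculation starts from a page count
that has already been reduced by early allocations.

```c
#include <stdint.h>

/* Assumed page size for this sketch (4K, as on i386 and armv7). */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Hypothetical helper modelling a default kmem size:
 * one page of kmem for every 'scale' pages of RAM, then clamped
 * to [size_min, size_max].  A limit of 0 means "no limit".
 * The limits are constants, while page_count varies with what
 * has already been allocated, which is where the mismatch
 * described in the text comes from.
 */
static uint64_t
sketch_default_kmem_size(uint64_t page_count, unsigned int scale,
    uint64_t size_min, uint64_t size_max)
{
	uint64_t size;

	if (scale < 1)
		scale = 1;	/* guard against a zero divisor */
	size = (page_count / scale) * SKETCH_PAGE_SIZE;
	if (size_min > 0 && size < size_min)
		size = size_min;
	if (size_max > 0 && size > size_max)
		size = size_max;
	return (size);
}
```

With 2GB of RAM (524288 pages of 4K) and no limits, a scale of 3 gives
about 683MB in this model and a scale of 5 about 410MB.  If the
maximum is smaller than the scaled value, the maximum wins, and nothing
in the calculation knows that an embedded md image has already taken
its share of kva.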

My version was originally to fix a related problem with PAE for i386.
The page tables can be very large (448MB for 16GB RAM just for the
main page table metadata).  When the kva size is 1GB, not accounting
for this throws all allocation sizes off by a factor of 2, and it is
hard to fit everything in the remaining ~512MB even with non-sloppy
calculations.  When the kva size is 4GB as in -current, the error
factor is closer to 1, so sloppy calculations have a chance of working.
The md disk needs the same treatment.  I think it can only be embedded
in the kernel text+data+bss+etc.  That is easy to handle, but I didn't
notice the problem since my kernels are relatively small.

Bruce
