svn commit: r243631 - in head/sys: kern sys

Alan Cox alc at rice.edu
Sun Jan 13 10:10:17 UTC 2013


On 01/07/2013 12:47, Oleksandr Tymoshenko wrote:
> On 12/27/2012 6:46 PM, Oleksandr Tymoshenko wrote:
>> On 12/18/2012 1:59 AM, Alan Cox wrote:
>>> On 12/17/2012 23:40, Oleksandr Tymoshenko wrote:
>>>> On 2012-12-08, at 1:21 PM, Alan Cox <alc at rice.edu> wrote:
>>>>
>>>>> On 12/08/2012 14:32, Andre Oppermann wrote:
>>>> .. skipped ..
>>>>
>>>>>> The trouble seems to come from NSFBUFS which is (512 + maxusers *
>>>>>> 16)
>>>>>> resulting in a kernel map of (512 + 400 * 16) * PAGE_SIZE =
>>>>>> 27MB.  This
>>>>>> seems to be pushing it with the smaller ARM kmap layout.
>>>>>>
>>>>>> Does it boot and run when you set the tunable kern.ipc.nsfbufs=3500?
>>>>>>
>>>>>> ARM does have a direct map mode as well which doesn't require the
>>>>>> allocation
>>>>>> of sfbufs.  I'm not sure which other problems that approach has.
>>>>>>
>>>>> Only a few (3?) platforms use it.  It reduces the size of the user
>>>>> address space, and translation between physical addresses and
>>>>> direct map
>>>>> addresses is not computationally trivial as it is on other
>>>>> architectures, e.g., amd64, ia64.  However, it does try to use large
>>>>> page mappings.
>>>>>
>>>>>
>>>>>> Hopefully alc@ (added to cc) can answer that and also why the
>>>>>> kmap of
>>>>>> 27MB
>>>>>> manages to wrench the ARM kernel.
>>>>>>
>>>>> ARM does not define caps on either the buffer map size (param.h)
>>>>> or the
>>>>> kmem map size (vmparam.h).  It would probably make sense to copy
>>>>> these
>>>>> definitions from i386.
>>>> Adding caps didn't help. I did some digging and found out that
>>>> although address range
>>>> 0xc0000000 .. 0xffffffff is indeed valid for ARM in general actual
>>>> KVA space varies for
>>>> each specific hardware platform. This "real" KVA is defined by
>>>> <virtual_avail, virtual_end>
>>>> pair and if I use them instead of <VM_MIN_KERNEL_ADDRESS,
>>>> VM_MAX_KERNEL_ADDRESS>
>>>> in init_param2 function my pandaboard successfully boots. Since
>>>> former pair is used for defining
>>>> kernel_map boundaries I believe it should be used for auto tuning
>>>> as well.
>>>
>>> That makes sense.  However, "virtual_avail" isn't the start of the
>>> kernel address space.  The kernel map always starts at
>>> VM_MIN_KERNEL_ADDRESS.  (See kmem_init().)  "virtual_avail" represents
>>> the next unallocated virtual address in the kernel address space at an
>>> early point in initialization.  "virtual_avail" and "virtual_end"
>>> aren't
>>> used after that, or outside the VM system.  Please use
>>> vm_map_min(kernel_map) and vm_map_max(kernel_map) instead.
>>
>> I checked: kernel_map is not available (NULL) at this point.  So we
>> can't use it to
>> determine real KVA size. Closest thing we can get is
>> virtual_avail/virtual_end pair.
>>
>> Andre, could you approve attached patch for commit or suggest better
>> solution?
>
> Any update on this one? Can I proceed with commit?
>

Yes, I've now spent a little bit of time looking at this, and I don't
see why these calculations and tunable_mbinit() need to be performed
before the kernel map is initialized. 

Let me summarize what I found:

1. The function tunable_mbinit() now has a dependency on the global
variable maxmbufmem.  tunable_mbinit() is executed under
SI_SUB_TUNABLES.  tunable_mbinit() defines the global variable
nmbclusters.  The statements made in the comment at the head of
tunable_mbinit() all appear to be false:

/*
 * tunable_mbinit() has to be run before init_maxsockets() thus
 * the SYSINIT order below is SI_ORDER_MIDDLE while init_maxsockets()
 * runs at SI_ORDER_ANY.
 *
 * NB: This has to be done before VM init.
 */

I don't see anything in init_maxsockets() that depends on
tunable_mbinit().  Moreover, the statement about "VM init" is only
correct if you regard the initialization of the kernel's malloc as "VM
init".

2. The function kmeminit() in kern/kern_malloc.c has a dependency on the
global variable nmbclusters.  kmeminit() is executed under SI_SUB_KMEM,
which comes after the initialization of the virtual memory system,
including the kernel map.

3. The function vm_ksubmap_init() has a dependency on the global
variable maxpipekva.  vm_ksubmap_init() is executed under SI_SUB_CPU,
which comes after SI_SUB_KMEM.
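
The ordering argument in points 1-3 can be modelled in user space.  This
is only a sketch, not kernel code: the MODEL_SUB_* ordinals are
illustrative stand-ins rather than the real SI_SUB_* values from
sys/kernel.h, and struct init_step, order_steps() and
runs_after_kernel_map() are hypothetical names invented for the example.
It assumes, as described above, that the kernel map is set up during VM
initialization, before SI_SUB_KMEM.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative ordinals only; the real SI_SUB_* values are in sys/kernel.h. */
enum model_sub {
	MODEL_SUB_TUNABLES = 1,	/* tunable_mbinit() runs here */
	MODEL_SUB_VM = 2,	/* assumed: kernel_map is created here */
	MODEL_SUB_KMEM = 3,	/* kmeminit() runs here */
	MODEL_SUB_CPU = 4	/* vm_ksubmap_init() runs here */
};

/* Hypothetical record pairing an init function's name with its stage. */
struct init_step {
	enum model_sub sub;
	const char *name;
};

static int
step_cmp(const void *a, const void *b)
{
	const struct init_step *sa = a, *sb = b;

	return ((int)sa->sub - (int)sb->sub);
}

/* Sort steps into the order a SYSINIT-style dispatcher would run them. */
static void
order_steps(struct init_step *steps, size_t n)
{
	assert(steps != NULL);
	qsort(steps, n, sizeof(steps[0]), step_cmp);
}

/*
 * Return 1 if the named step runs after the kernel map exists in this
 * model, 0 if it runs before, and -1 if the name is not in the table.
 */
static int
runs_after_kernel_map(const struct init_step *steps, size_t n,
    const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(steps[i].name, name) == 0)
			return (steps[i].sub > MODEL_SUB_VM);
	return (-1);
}
```

In this model, tunable_mbinit() is the only one of the three functions
that runs before the kernel map exists; kmeminit() and vm_ksubmap_init()
both run afterwards, so they are free to consult the map's bounds.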

Am I missing anything?

I'm attaching a patch that defers the calculation of maxpipekva until we
actually need it in vm_ksubmap_init().  Any comments on this patch are
welcome.

Alan

-------------- next part --------------
Index: kern/subr_param.c
===================================================================
--- kern/subr_param.c	(revision 245346)
+++ kern/subr_param.c	(working copy)
@@ -97,7 +97,6 @@ quad_t	maxmbufmem;			/* max mbuf memory */
 pid_t	pid_max = PID_MAX;
 long	maxswzone;			/* max swmeta KVA storage */
 long	maxbcache;			/* max buffer cache KVA storage */
-long	maxpipekva;			/* Limit on pipe KVA */
 int 	vm_guest;			/* Running as virtual machine guest? */
 u_long	maxtsiz;			/* max text size */
 u_long	dfldsiz;			/* initial data size limit */
@@ -339,18 +338,6 @@ init_param2(long physpages)
 	TUNABLE_QUAD_FETCH("kern.maxmbufmem", &maxmbufmem);
 	if (maxmbufmem > (realmem / 4) * 3)
 		maxmbufmem = (realmem / 4) * 3;
-
-	/*
-	 * The default for maxpipekva is min(1/64 of the kernel address space,
-	 * max(1/64 of main memory, 512KB)).  See sys_pipe.c for more details.
-	 */
-	maxpipekva = (physpages / 64) * PAGE_SIZE;
-	TUNABLE_LONG_FETCH("kern.ipc.maxpipekva", &maxpipekva);
-	if (maxpipekva < 512 * 1024)
-		maxpipekva = 512 * 1024;
-	if (maxpipekva > (VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS) / 64)
-		maxpipekva = (VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS) /
-		    64;
 }
 
 /*
Index: kern/sys_pipe.c
===================================================================
--- kern/sys_pipe.c	(revision 245346)
+++ kern/sys_pipe.c	(working copy)
@@ -207,6 +207,8 @@ static int pipeallocfail;
 static int piperesizefail;
 static int piperesizeallowed = 1;
 
+long maxpipekva;
+
 SYSCTL_LONG(_kern_ipc, OID_AUTO, maxpipekva, CTLFLAG_RDTUN,
 	   &maxpipekva, 0, "Pipe KVA limit");
 SYSCTL_LONG(_kern_ipc, OID_AUTO, pipekva, CTLFLAG_RD,
Index: vm/vm_init.c
===================================================================
--- vm/vm_init.c	(revision 245346)
+++ vm/vm_init.c	(working copy)
@@ -132,12 +132,14 @@ vm_ksubmap_init(struct kva_md_info *kmi)
 {
 	vm_offset_t firstaddr;
 	caddr_t v;
-	vm_size_t size = 0;
+	vm_size_t kernel_map_size, size = 0;
 	long physmem_est;
 	vm_offset_t minaddr;
 	vm_offset_t maxaddr;
 	vm_map_t clean_map;
 
+	kernel_map_size = kernel_map->max_offset - kernel_map->min_offset;
+
 	/*
 	 * Allocate space for system data structures.
 	 * The first available kernel virtual address is in "v".
@@ -163,8 +165,7 @@ again:
 	 * Discount the physical memory larger than the size of kernel_map
 	 * to avoid eating up all of KVA space.
 	 */
-	physmem_est = lmin(physmem, btoc(kernel_map->max_offset -
-	    kernel_map->min_offset));
+	physmem_est = lmin(physmem, btoc(kernel_map_size));
 
 	v = kern_vfs_bio_buffer_alloc(v, physmem_est);
 
@@ -195,6 +196,18 @@ again:
 	pager_map->system_map = 1;
 	exec_map = kmem_suballoc(kernel_map, &minaddr, &maxaddr,
 	    exec_map_entries * round_page(PATH_MAX + ARG_MAX), FALSE);
+
+	/*
+	 * The default size for the pipe submap, "maxpipekva", is min(1/64 of
+	 * the kernel virtual address space, max(1/64 of the physical memory,
+	 * 512KB)).  See sys_pipe.c for more details.
+	 */
+	maxpipekva = ctob(physmem / 64);
+	TUNABLE_LONG_FETCH("kern.ipc.maxpipekva", &maxpipekva);
+	if (maxpipekva < 512 * 1024)
+		maxpipekva = 512 * 1024;
+	if (maxpipekva > kernel_map_size / 64)
+		maxpipekva = kernel_map_size / 64;
 	pipe_map = kmem_suballoc(kernel_map, &minaddr, &maxaddr, maxpipekva,
 	    FALSE);
 

