svn commit: r243631 - in head/sys: kern sys
Alan Cox
alc at rice.edu
Sun Jan 13 10:10:17 UTC 2013
On 01/07/2013 12:47, Oleksandr Tymoshenko wrote:
> On 12/27/2012 6:46 PM, Oleksandr Tymoshenko wrote:
>> On 12/18/2012 1:59 AM, Alan Cox wrote:
>>> On 12/17/2012 23:40, Oleksandr Tymoshenko wrote:
>>>> On 2012-12-08, at 1:21 PM, Alan Cox <alc at rice.edu> wrote:
>>>>
>>>>> On 12/08/2012 14:32, Andre Oppermann wrote:
>>>> .. skipped ..
>>>>
>>>>>> The trouble seems to come from NSFBUFS which is (512 + maxusers *
>>>>>> 16)
>>>>>> resulting in a kernel map of (512 + 400 * 16) * PAGE_SIZE =
>>>>>> 27MB. This
>>>>>> seem to be pushing it with the smaller ARM kmap layout.
>>>>>>
>>>>>> Does it boot and run when you set the tunable kern.ipc.nsfbufs=3500?
>>>>>>
>>>>>> ARM does have a direct map mode as well which doesn't require the
>>>>>> allocation
>>>>>> of sfbufs. I'm not sure which other problems that approach has.
>>>>>>
>>>>> Only a few (3?) platforms use it. It reduces the size of the user
>>>>> address space, and translation between physical addresses and
>>>>> direct map
>>>>> addresses is not computationally trivial as it is on other
>>>>> architectures, e.g., amd64, ia64. However, it does try to use large
>>>>> page mappings.
>>>>>
>>>>>
>>>>>> Hopefully alc@ (added to cc) can answer that and also why the
>>>>>> kmap of
>>>>>> 27MB
>>>>>> manages to wrench the ARM kernel.
>>>>>>
>>>>> Arm does not define caps on either the buffer map size (param.h)
>>>>> or the
>>>>> kmem map size (vmparam.h). It would probably make sense to copy
>>>>> these
>>>>> definitions from i386.
>>>> Adding caps didn't help. I did some digging and found out that
>>>> although the address range 0xc0000000 .. 0xffffffff is indeed valid
>>>> for ARM in general, the actual KVA space varies for each specific
>>>> hardware platform. This "real" KVA is defined by the
>>>> <virtual_avail, virtual_end> pair, and if I use them instead of
>>>> <VM_MIN_KERNEL_ADDRESS, VM_MAX_KERNEL_ADDRESS> in the
>>>> init_param2 function my pandaboard successfully boots. Since the
>>>> former pair is used for defining the kernel_map boundaries I
>>>> believe it should be used for auto tuning as well.
>>>
>>> That makes sense. However, "virtual_avail" isn't the start of the
>>> kernel address space. The kernel map always starts at
>>> VM_MIN_KERNEL_ADDRESS. (See kmem_init().) "virtual_avail" represents
>>> the next unallocated virtual address in the kernel address space at an
>>> early point in initialization. "virtual_avail" and "virtual_end"
>>> aren't
>>> used after that, or outside the VM system. Please use
>>> vm_map_min(kernel_map) and vm_map_max(kernel_map) instead.
>>
>> I checked: kernel_map is not available (NULL) at this point, so we
>> can't use it to determine the real KVA size. The closest thing we can
>> get is the virtual_avail/virtual_end pair.
>>
>> Andre, could you approve the attached patch for commit or suggest a
>> better solution?
>
> Any update on this one? Can I proceed with commit?
>
Yes, I've now spent a little bit of time looking at this, and I don't
see why these calculations and tunable_mbinit() need to be performed
before the kernel map is initialized.
Let me summarize what I found:
1. The function tunable_mbinit() now has a dependency on the global
variable maxmbufmem. tunable_mbinit() is executed under
SI_SUB_TUNABLES. tunable_mbinit() defines the global variable
nmbclusters. The statements made in the comment at the head of
tunable_mbinit() all appear to be false:
	/*
	 * tunable_mbinit() has to be run before init_maxsockets() thus
	 * the SYSINIT order below is SI_ORDER_MIDDLE while init_maxsockets()
	 * runs at SI_ORDER_ANY.
	 *
	 * NB: This has to be done before VM init.
	 */
I don't see anything in init_maxsockets() that depends on
tunable_mbinit(). Moreover, the statement about "VM init" is only
correct if you regard the initialization of the kernel's malloc as "VM
init".
2. The function kmeminit() in kern/kern_malloc.c has a dependency on the
global variable nmbclusters. kmeminit() is executed under SI_SUB_KMEM,
which comes after the initialization of the virtual memory system,
including the kernel map.
3. The function vm_ksubmap_init() has a dependency on the global
variable maxpipekva. vm_ksubmap_init() is executed under SI_SUB_CPU,
which comes after SI_SUB_KMEM.
Am I missing anything?
I'm attaching a patch that defers the calculation of maxpipekva until we
actually need it in vm_ksubmap_init(). Any comments on this patch are
welcome.
Alan
-------------- next part --------------
Index: kern/subr_param.c
===================================================================
--- kern/subr_param.c (revision 245346)
+++ kern/subr_param.c (working copy)
@@ -97,7 +97,6 @@ quad_t maxmbufmem; /* max mbuf memory */
pid_t pid_max = PID_MAX;
long maxswzone; /* max swmeta KVA storage */
long maxbcache; /* max buffer cache KVA storage */
-long maxpipekva; /* Limit on pipe KVA */
int vm_guest; /* Running as virtual machine guest? */
u_long maxtsiz; /* max text size */
u_long dfldsiz; /* initial data size limit */
@@ -339,18 +338,6 @@ init_param2(long physpages)
TUNABLE_QUAD_FETCH("kern.maxmbufmem", &maxmbufmem);
if (maxmbufmem > (realmem / 4) * 3)
maxmbufmem = (realmem / 4) * 3;
-
- /*
- * The default for maxpipekva is min(1/64 of the kernel address space,
- * max(1/64 of main memory, 512KB)). See sys_pipe.c for more details.
- */
- maxpipekva = (physpages / 64) * PAGE_SIZE;
- TUNABLE_LONG_FETCH("kern.ipc.maxpipekva", &maxpipekva);
- if (maxpipekva < 512 * 1024)
- maxpipekva = 512 * 1024;
- if (maxpipekva > (VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS) / 64)
- maxpipekva = (VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS) /
- 64;
}
/*
Index: kern/sys_pipe.c
===================================================================
--- kern/sys_pipe.c (revision 245346)
+++ kern/sys_pipe.c (working copy)
@@ -207,6 +207,8 @@ static int pipeallocfail;
static int piperesizefail;
static int piperesizeallowed = 1;
+long maxpipekva;
+
SYSCTL_LONG(_kern_ipc, OID_AUTO, maxpipekva, CTLFLAG_RDTUN,
&maxpipekva, 0, "Pipe KVA limit");
SYSCTL_LONG(_kern_ipc, OID_AUTO, pipekva, CTLFLAG_RD,
Index: vm/vm_init.c
===================================================================
--- vm/vm_init.c (revision 245346)
+++ vm/vm_init.c (working copy)
@@ -132,12 +132,14 @@ vm_ksubmap_init(struct kva_md_info *kmi)
{
vm_offset_t firstaddr;
caddr_t v;
- vm_size_t size = 0;
+ vm_size_t kernel_map_size, size = 0;
long physmem_est;
vm_offset_t minaddr;
vm_offset_t maxaddr;
vm_map_t clean_map;
+ kernel_map_size = kernel_map->max_offset - kernel_map->min_offset;
+
/*
* Allocate space for system data structures.
* The first available kernel virtual address is in "v".
@@ -163,8 +165,7 @@ again:
* Discount the physical memory larger than the size of kernel_map
* to avoid eating up all of KVA space.
*/
- physmem_est = lmin(physmem, btoc(kernel_map->max_offset -
- kernel_map->min_offset));
+ physmem_est = lmin(physmem, btoc(kernel_map_size));
v = kern_vfs_bio_buffer_alloc(v, physmem_est);
@@ -195,6 +196,18 @@ again:
pager_map->system_map = 1;
exec_map = kmem_suballoc(kernel_map, &minaddr, &maxaddr,
exec_map_entries * round_page(PATH_MAX + ARG_MAX), FALSE);
+
+ /*
+ * The default size for the pipe submap, "maxpipekva", is min(1/64 of
+ * the kernel virtual address space, max(1/64 of the physical memory,
+ * 512KB)). See sys_pipe.c for more details.
+ */
+ maxpipekva = ctob(physmem / 64);
+ TUNABLE_LONG_FETCH("kern.ipc.maxpipekva", &maxpipekva);
+ if (maxpipekva < 512 * 1024)
+ maxpipekva = 512 * 1024;
+ if (maxpipekva > kernel_map_size / 64)
+ maxpipekva = kernel_map_size / 64;
pipe_map = kmem_suballoc(kernel_map, &minaddr, &maxaddr, maxpipekva,
FALSE);