svn commit: r208589 - head/sys/mips/mips
Jayachandran C.
c.jayachandran at gmail.com
Tue Jun 8 21:31:14 UTC 2010
On Tue, Jun 8, 2010 at 12:03 PM, Alan Cox <alc at cs.rice.edu> wrote:
> C. Jayachandran wrote:
>>
>> On Tue, Jun 8, 2010 at 2:59 AM, Alan Cox <alc at cs.rice.edu> wrote:
>>
>>>
>>> On 6/7/2010 3:28 PM, Kostik Belousov wrote:
>>>
>>>>
>>>> Selecting a random message in the thread to ask my question.
>>>> Is the issue that page table pages should be allocated from the specific
>>>> physical region of the memory ? If yes, doesn't i386 PAE has similar
>>>> issue with page directory pointer table ? I see a KASSERT in i386
>>>> pmap that verifies that the allocated table is below 4G, but I do not
>>>> understand how uma ensures the constraint (I suspect that it does not).
>>>>
>>>>
>>>
>>> For i386 PAE, the UMA backend allocator uses kmem_alloc_contig() to
>>> ensure
>>> that the memory is below 4G. The crucial difference between i386 PAE and
>>> MIPS is that for i386 PAE only the top-level table needs to be below a
>>> specific address threshold. Moreover, this level is allocated in a
>>> place,
>>> pmap_pinit(), where we are allowed to sleep.
>>>
>>
>> Yes. I saw the PAE top level page table code and thought I could use
>> that mechanism for allocating MIPS page table pages in the direct
>> mapped memory. The other reference I used was
>> pmap_alloc_zeroed_contig_pages() function in sun4v/sun4v/pmap.c which
>> uses the vm_phys_alloc_contig() and VM_WAIT.
>
> That's unfortunate. :-( Since sun4v is essentially dead code, I've never
> spent much time thinking about its pmap implementation. I'll mechanically
> apply changes to it, but that's about it. I wouldn't recommend using it as
> a reference.
>
>> ... I had also thought of
>> using VM_FREEPOOL_DIRECT, which seemed to be for a similar purpose,
>> but could not see any usage of it in the kernel.
>>
>>
>
> VM_FREEPOOL_DIRECT is used by at least amd64 and ia64 for page table pages
> and small kernel memory allocations. Unlike mips, these machines don't have
> MMU support for a direct map. Their direct maps are just a range of
> mappings in the regular (kernel) page table. So, unlike mips, accesses
> through their direct map may still miss in the TLB and require a page table
> walk. VM_FREEPOOL_* is a way to increase the physical locality (or
> clustering) of page allocations, so that, for example, page table page
> accesses by the pmap on amd64 are less likely to miss in the TLB. However,
> it doesn't place a hard restriction on the range of physical addresses that
> will be used, which you need for mips.
>
> The impact of this clustering can be significant. For example, on amd64 we
> use 2MB page mappings to implement the direct map. However, old Opterons
> only had 8 data TLB entries for 2MB page mappings. For a uniprocessor
> kernel running on such an Opteron, I measured an 18% reduction in system
> time during a buildworld with the introduction of VM_FREEPOOL_DIRECT. (See
> the commit logs for vm/vm_phys.c and the comment that precedes the
> VM_NFREEORDER definition on amd64.)
>
> Until such time as superpage support is ported to mips from the amd64/i386
> pmaps, I don't think there is a point in having more than one VM_FREEPOOL_*
> on mips. And then, the point would be to reduce fragmentation of the
> physical memory that could be caused by small allocations, such as page
> table pages.
Thanks for the detailed explanation.
Also, after looking at the code again, I think vm_phys_alloc_contig()
can be optimized to skip segments that lie entirely outside the area of
interest. The patch is:
Index: sys/vm/vm_phys.c
===================================================================
--- sys/vm/vm_phys.c (revision 208890)
+++ sys/vm/vm_phys.c (working copy)
@@ -595,7 +595,7 @@
 	vm_object_t m_object;
 	vm_paddr_t pa, pa_last, size;
 	vm_page_t deferred_vdrop_list, m, m_ret;
-	int flind, i, oind, order, pind;
+	int segind, i, oind, order, pind;
 
 	size = npages << PAGE_SHIFT;
 	KASSERT(size != 0,
@@ -611,21 +611,20 @@
 #if VM_NRESERVLEVEL > 0
 retry:
 #endif
-	for (flind = 0; flind < vm_nfreelists; flind++) {
+	for (segind = 0; segind < vm_phys_nsegs; segind++) {
+		/*
+		 * A free list may contain physical pages
+		 * from one or more segments.
+		 */
+		seg = &vm_phys_segs[segind];
+		if (seg->start > high || low >= seg->end)
+			continue;
+
 		for (oind = min(order, VM_NFREEORDER - 1); oind <
 		    VM_NFREEORDER; oind++) {
 			for (pind = 0; pind < VM_NFREEPOOL; pind++) {
-				fl = vm_phys_free_queues[flind][pind];
+				fl = (*seg->free_queues)[pind];
 				TAILQ_FOREACH(m_ret, &fl[oind].pl, pageq) {
 					/*
-					 * A free list may contain physical pages
-					 * from one or more segments.
-					 */
-					seg = &vm_phys_segs[m_ret->segind];
-					if (seg->start > high ||
-					    low >= seg->end)
-						continue;
-
-					/*
 					 * Is the size of this allocation request
 					 * larger than the largest block size?
 					 */
-----
With this change, along with the vmparam.h changes for HIGHMEM, I think
we should be able to use vm_phys_alloc_contig() for page table pages (or
have I again missed something fundamental?).
JC.
More information about the freebsd-mips mailing list