svn commit: r208589 - head/sys/mips/mips
Kostik Belousov <kostikbel at gmail.com>
Mon Jun 7 20:46:37 UTC 2010
On Mon, Jun 07, 2010 at 01:49:36PM -0500, Alan Cox wrote:
> C. Jayachandran wrote:
> >On Mon, Jun 7, 2010 at 10:57 PM, Alan Cox <alc at cs.rice.edu> wrote:
> >
> >>C. Jayachandran wrote:
> >>
> >>>On Fri, Jun 4, 2010 at 10:44 PM, Alan Cox <alc at cs.rice.edu> wrote:
> >>>
> >>>
> >>>>C. Jayachandran wrote:
> >>>>
> >>>>
> >>>>>On Thu, Jun 3, 2010 at 10:33 PM, Alan Cox <alc at cs.rice.edu> wrote:
> >>>>>
> >>>>[snip]
> >>>>
> >>>>
> >>>>>>Add a static counter to pmap_ptpgzone_allocf(). When the counter
> >>>>>>reaches some not-too-small prime number, don't call
> >>>>>>vm_phys_alloc_contig(); instead, set "m" to NULL and reset the
> >>>>>>counter to zero. Setting "m" to NULL should then trigger the
> >>>>>>vm_contig_grow_cache() call. Make sense?
> >>>>>>
> >>>>>Is this to move the vm_contig_grow_cache() out of the
> >>>>>pmap_ptpgzone_allocf() and into the function calling uma_zalloc()? I
> >>>>>am wondering why the prime number is needed.
> >>>>>
> >>>>>Another implementation I had thought about was to have a kernel
> >>>>>process maintain a pool of direct mapped pages for MIPS. This will be
> >>>>>woken up if the pool goes below a low water mark - any comments on
> >>>>>this?
> >>>>>
> >>>>Here is how the page table page allocation should be done. It's not
> >>>>much harder than the vm_contig_grow_cache() change.
> >>>>
> >>>>1. Look at vm/vm_phys.c, in particular, vm_phys_init(). Create a new
> >>>>freelist, like VM_FREELIST_ISADMA, for the direct-mapped memory on
> >>>>MIPS. Perhaps, call it VM_FREELIST_LOWMEM. The controlling macros
> >>>>should be defined in mips/include/vmparam.h.
> >>>>
> >>>>2. Trivially refactor vm_phys_alloc_pages(). Specifically, take the
> >>>>body of the outer for loop and make it a new function,
> >>>>vm_phys_alloc_freelist(), that takes the variable "flind" as a
> >>>>parameter.
> >>>>
> >>>>3. Eliminate the UMA zone for page table pages. In place of
> >>>>vm_phys_alloc_contig(), call your new function with
> >>>>VM_FREELIST_LOWMEM. (The vm_contig_grow_cache() remains unchanged.)
> >>>>Go back to calling vm_page_free() to release page table pages.
> >>>>
> >>>>For the record, the changes that you made to start using a zone for
> >>>>page table page allocation introduced at least one possible race
> >>>>condition between pmap_remove_pages() and pmap_remove_all().
> >>>>Specifically, by dropping and reacquiring the page queues and pmap
> >>>>lock when you free a page table page, you allow a pmap_remove_all()
> >>>>call while pmap_remove_pages() is running to leave the variable
> >>>>"npv" in pmap_remove_pages() pointing at a freed pv entry.
> >>>>
> >>>>
> >>>My first attempt is attached, it comes up multiuser but crashes if I
> >>>stress it a bit (panic: Bad link elm 0xc0078f00 prev->next != elm).
> >>>Will look at this tomorrow, and see if I can find the cause.
> >>>
> >>Where exactly is this occurring?
> >>
> >>
> >>>In the meantime, it may be worth looking at the patch to see if this
> >>>is in line with the approach you had suggested. I have tried to use
> >>>VM_FREELIST_HIGHMEM which is already there, instead of adding
> >>>VM_FREELIST_LOWMEM.
> >>>
> >>Basically, yes. At first, I was taken aback by defining
> >>VM_FREELIST_DEFAULT as other than zero, but it shouldn't break
> >>anything.
> >>
> >
> >I can change this to add VM_FREELIST_LOWMEM. This works okay so far,
> >even though I haven't really checked the code to see if it works when
> >there is no memory in HIGHMEM.
> >
> >
>
> I think that it will. It will just waste a little time looking at
> empty queues.
>
> >>The one issue that I see with the patch is that you'll need to
> >>release and reacquire the pmap and page queues locks around the call
> >>to vm_contig_grow_cache(). However, if you release these locks, you
> >>need to restart the pmap_allocpte(), i.e., perform the "goto retry;",
> >>because while you slept another processor/thread might have already
> >>allocated and installed the page table page that you are trying to
> >>allocate into the page table.
> >>
> >>For what it's worth, this same race condition exists in the current
> >>code in subversion using a UMA zone, but not in the code that
> >>predates use of a UMA zone.
> >>
> >
> >I have been testing with an updated version of the patch - my -j128
> >buildworld works on this. The sys/vm side changes are simple enough,
> >but the page table page allocation part is still not complete. The
> >version I have now is (with gmail whitespace damage):
> >
> >---
> >static vm_page_t
> >pmap_alloc_pte_page(pmap_t pmap, unsigned int index, int wait,
> >    vm_offset_t *vap)
> >{
> >	vm_page_t m;
> >	int tries, flags;
> >
> >	tries = 0;
> >retry:
> >	mtx_lock(&vm_page_queue_free_mtx);
> >	m = vm_phys_alloc_freelist_pages(VM_FREELIST_DEFAULT,
> >	    VM_FREEPOOL_DEFAULT, 0);
> >	if (m == NULL) {
> >		mtx_unlock(&vm_page_queue_free_mtx);
> >		if (tries < ((wait & M_NOWAIT) != 0 ? 1 : 3)) {
> >			vm_contig_grow_cache(tries, 0,
> >			    MIPS_KSEG0_LARGEST_PHYS);
> >			tries++;
> >			goto retry;
> >		} else {
> >			printf("[%d] %s wait %x, fail!\n", tries,
> >			    __func__, wait);
> >			return (NULL);
> >		}
> >	}
> >
> >	m->pindex = index;
> >	m->valid = VM_PAGE_BITS_ALL;
> >	atomic_add_int(&cnt.v_wire_count, 1);
> >	m->wire_count = 1;
> >	if (m->flags & PG_ZERO)
> >		vm_page_zero_count--;
> >	else
> >		pmap_zero_page(m);
> >	if ((m->flags & PG_CACHED) != 0)
> >		printf("Not yet! handle cached page %p flags 0x%x\n",
> >		    m, m->flags);
> >	else
> >		cnt.v_free_count--;
> >	flags = PG_ZERO | PG_UNMANAGED;
> >	m->flags = flags;
> >	m->oflags = 0;
> >	m->act_count = 0;
> >	mtx_unlock(&vm_page_queue_free_mtx);
> >	return (m);
> >}
> >
> >
> >----
> >
> >I tried to match the vm_page_alloc() code here. I'm not sure if I have
> >to handle the PG_CACHED case, and if it is okay to change the VM
> >counters here.
> >
> >
>
> You do. In the end, I suspect that this function will be added to the
> machine-independent layer.
>
> >Also, I will change the code to return NULL all the way and retry
> >allocation (similar to the old code which called VM_WAIT and then
> >returned NULL).
> >
> >Probably I should get the whole code in line with the old code,
> >replacing the old vm_page_alloc() call with the function above without
> >retry, and the VM_WAIT with a new function which does the
> >vm_contig_grow_cache().
> >
> >
>
> You may find it easier if you move the call to vm_contig_grow_cache() to
> pmap_allocpte().
Selecting a random message in the thread to ask my question.
Is the issue that page table pages should be allocated from a specific
physical region of memory? If yes, doesn't i386 PAE have a similar
issue with the page directory pointer table? I see a KASSERT in the
i386 pmap that verifies that the allocated table is below 4G, but I do
not understand how UMA ensures the constraint (I suspect that it does not).