(for archive) Re: Performance of SheevaPlug on 8-stable

Mark Tinguely marktinguely at gmail.com
Wed Nov 3 16:08:45 UTC 2010


Olivier Houchard wrote:
> On Mon, Mar 22, 2010 at 09:55:04AM -0500, Mark Tinguely wrote:
>   
>> On Mon, 08 Mar 2010 16:50:58, Grzegorz Bernacki said:
>>
>>     
>>>  This is probably caused by mechanism which turns of cache for shared pages.
>>>  When I add applied following path:
>>>
>>>  diff --git a/sys/arm/arm/pmap.c b/sys/arm/arm/pmap.c
>>>  index 390dc3c..d17c0cc 100644
>>>  --- a/sys/arm/arm/pmap.c
>>>  +++ b/sys/arm/arm/pmap.c
>>>  @@ -1401,6 +1401,8 @@ pmap_fix_cache(struct vm_page *pg, pmap_t pm, vm_offset_t va)
>>>            */
>>>
>>>           TAILQ_FOREACH(pv, &pg->md.pv_list, pv_list) {
>>>  +               if (pv->pv_flags & PVF_EXEC)
>>>  +                       return;
>>>                           /* generate a count of the pv_entry uses */
>>>                   if (pv->pv_flags & PVF_WRITE) {
>>>                           if (pv->pv_pmap == pmap_kernel())
>>>
>>>  execution time of 'test' program is:
>>>  mv78100-4# time ./test
>>>  5.000u 0.000s 0:05.40 99.8%     40+1324k 0+0io 0pf+0w
>>>
>>>  and without this path is:
>>>  mv78100-4# time ./test
>>>  295.000u 0.000s 4:56.01 99.7%   40+1322k 0+0io 0pf+0w
>>>
>>>
>>>  I think we need to handle executable pages in different way.
>>>
>>>  grzesiek
>>>       
>> Good going Oliver and thank-you on the pmap_enter_pv kernel map patch Revision
>> 205425.
>>
>> Last week, before this patch, Maks Verver was so kind to put some statements
>> in the VM (vm_page_free_toq()) for the SheevaPlug because I could not cause
>> these paths with the Gumstix emulator. Maks, could you add to
>> vm_phys_free_pages():
>>
>> 	if (m->md.pv_kva)
>> +	{
>> +		printf("vm_phys_free_pages: md.pv_kva 0x%08x\n", m->md.pv_kva);
>> 		m->md.pv_kva = 0;
>> +	}
>>
>> Even on the Gumstix emulator with the current patch, pmap_fix_cache() still
>> has many executable pages that have both a kernel and user pv_entry. Looks
>> like something like the above patch is still needed.
>>
>>     
>
> I added a few printf and saw the same thing, however isn't assuming we 
> shouldn't modify the cache settings if the page is executable a bit dangerous ?
> Or did I misread your patch ?
>
> Olivier
>
>   
I am cleaning up my old files. I thought I would put some numbers to 
this old thread into the archive in case this comes up again.

Warning: The bus dma sync code for VIVT caches will only perform cache 
sync operations on the calling virtual address. It works correctly 
because the I/O request maps the page to a KVA and uses this address to 
perform the I/O. If the page is becomes shared when mapped to the KVA, 
then pmap_fix_cache routine cleans out the cache for every current share 
mapping and disables the cache on the page table entry. We are talking 
about NOT performing the cache operations on a shared executable. I 
think this will not be a problem for reading an executable page for the 
follow reasons:
    0) I/O may cause a context change and flush the caches anyway. (very 
weak reason).
    1) We should not be changing the contents of the executable file 
while the program is active without performing caches operations.
    2) The executable page has been previously accessed and leaving part 
of the executable contents in the level 1 cache, then there should also 
be a RAM cached version in the buffer cache and we won't do DMA.
          ---
I put some counters into pmap_fix_cache and found that:

    About 1%  of the calls to pmap_fix_cache that has at least one 
executable mapping, also has one mapping that is not executable. In 
other words, the vast majority of executable pages have only executable 
mappings.

    About 89% of the pages that call pmap_fix_cache, also share the page 
with another pmap. It appears most are shared with another user pmap.

    About 4% of the pages that call pmap_fix_cache has at least one 
writable mapping.

    Of those pages that call pmap_fix_cache that have at least one 
writable mapping; About 80% of time, this is a user mapped write. IMO, 
user mapped writes must perform the cache operations.

I came up with a modification that ignores the cache fixing only for the 
kernel mapped writes. I don't know if this too conservative to do any 
good, but it should be safer than ignoring the cache maintenance 
routines for all executable mappings.

Index: arm/arm/pmap.c
===================================================================
--- arm/arm/pmap.c      (revision 214743)
+++ arm/arm/pmap.c      (working copy)
@@ -1309,6 +1309,7 @@
        int pmwc = 0;
        int writable = 0, kwritable = 0, uwritable = 0;
        int entries = 0, kentries = 0, uentries = 0;
+       int exec = 0, ncaches = 0;
        struct pv_entry *pv;

        mtx_assert(&vm_page_queue_mtx, MA_OWNED);
@@ -1335,7 +1336,19 @@
                                uentries++;
                        entries++;
                }
+               if (pv->pv_flags & PVF_EXEC)
+                       exec++;
+               if (pv->pv_flags & PVF_NC)
+                       ncaches++;
        }
+               /* don't cache if there is kernel mapping to otherwise
+                * read-only executable mappings.
+                */
+       if (exec && (exec == entries + kentries) &&
+           (writable == 0 || (writable == kwritable)) &&
+           (ncaches == 0))
+               return;
+
                /*
                 * check if the user duplicate mapping has
                 * been removed.

The conditions that I chose to bypass the cache operations are:
(line 1) pure executable mapped page.
(line 2) either not writable or only writable to the kernel map (cluster 
IO can make this >1)
(line 3) this is not a request to turn caches back on.

--Mark.

 


More information about the freebsd-arm mailing list