Performance of SheevaPlug on 8-stable
Bernd Walter
ticso at cicely7.cicely.de
Mon Mar 8 19:55:04 UTC 2010
On Mon, Mar 08, 2010 at 01:37:23PM -0600, Mark Tinguely wrote:
>
> <deleted>
> >
> > This puzzled me as well.
> > What is the requirement for such a handling with shared pages?
> > I thought handing over shared data is done by cache-flush, barriers or
> > whatever an architecture has for this.
> > Most systems we talk about are single CPU, so it is just DMA and
> > handing over dcache writes to icache, but we don't support self
> > modifying code, so it is always done in a controlled way.
> > And even for SMP systems handing over data requires using
> > cache coherence mechanisms - e.g. those embedded in mutexes.
> > So what is wrong in my picture and requires us to do special handling
> > for shared pages on ARM?
> >
> > > And if there's only one copy of 'test' running, why does it hit the
> > > 'shared' case for this code?
> > >
> > > Warner
> >
> > --
> > B.Walter <bernd at bwct.de> http://www.bwct.de
> > Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers and much more.
>
> ARMv4/ARMv5 use virtually indexed / virtually tagged (VIVT) level one
> caches. They may or may not have level two caches. These are the ARM chips
> that we currently support, and I will explain the rules below.
>
> The newer ARMv6 processors can have virtually indexed / physically tagged
> (VIPT) or physically indexed / physically tagged (PIPT) level one caches;
> the ARMv7 must have PIPT level one caches. The ARMv6 and ARMv7
> have more pde/pte bits describing the cache status of the "inner"
> and "outer" caches. The ARMv7 has the more mature cache management;
> it defines the "level of unification" and "level of coherency" for the
> caches. There is also a level of snooping for ARMv7 multi-core, which I
> will just dance around. A PIPT cache must be synced to the "level of
> coherency" before DMA and when modified from another process - think
> debugger in another address space modifying instruction code. ARMv6/ARMv7
> have special address space identifiers to avoid tlb flushes. If they are
> not used, then tlbs have to be flushed on context switch. This is close to
> the i386/amd64 with the exception of DMA; the i386/amd64 have
> self-snooping cache buses.
>
> VIVT cache rules:
>
> 1) flush cache and tlb on context change.
>
> 2) USER cache must be disabled if a physical page has AT LEAST one writable
> user mapping AND is also mapped more than one time in the same user
> address space. (Multiple read mappings with no writable mapping are fine;
> they just take up multiple cache entries. Obviously, a single read or a
> single write mapping is fine. If the mappings are in different user
> address spaces, we will be okay because the flush on context change will
> sync things up.)
>
> 3) KERNEL spaces are global.
> a) If the page is mapped writable AT LEAST ONCE to a kernel space
> AND the page is mapped more than once, no matter if the second
> mapping is in the user or kernel space, all mappings must not
> be cached.
I never assumed to be happy without a direct map.
> b) If the page has only readable kernel mappings but at least one
> writable user mapping, the cache must be disabled for the mappings
> of the page in this address space. This is slightly different from
> rule 2. Kernel mappings are typically writable, so this is a
> case that really does not happen.
>
> It gets a little tricky to implement, because we have to catch the transition
> from cache -> non-cache (change pte and wbinv/inv data or instruction caches)
> and from non-cache -> cache (change the pte).
Thanks for the detailed explanation.
It took me a while, but now I get it.
My picture didn't account for caches that are indexed and tagged by
virtual address.
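
To make rules 2 and 3a concrete, here is a minimal sketch in C of how the
cacheability decision for one physical page could look. It is not the actual
pmap code: struct mapping, KERNEL_AS and vivt_fix_cache() are hypothetical
names made up for illustration, rule 3b is left out because the explanation
above says it does not really happen, and the cache -> non-cache transition
work (pte update, wbinv/inv) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define KERNEL_AS 0 /* hypothetical id for the global kernel address space */

/* One virtual mapping of a single physical page (illustrative only). */
struct mapping {
	int  as_id;    /* KERNEL_AS or a user address space id */
	bool writable; /* mapping allows writes */
	bool uncached; /* output: mapping must not be cached */
};

/*
 * Decide, for all n mappings of one physical page, which ones must be
 * uncached on a VIVT cache according to rules 2 and 3a.
 */
static void
vivt_fix_cache(struct mapping *m, size_t n)
{
	size_t i, j;
	bool kernel_writable = false;

	assert(m != NULL || n == 0);

	for (i = 0; i < n; i++) {
		m[i].uncached = false;
		if (m[i].as_id == KERNEL_AS && m[i].writable)
			kernel_writable = true;
	}

	/* Rule 3a: writable kernel mapping plus any second mapping. */
	if (kernel_writable && n > 1) {
		for (i = 0; i < n; i++)
			m[i].uncached = true;
		return;
	}

	/* Rule 2: writable and mapped more than once in one user space. */
	for (i = 0; i < n; i++) {
		size_t count = 0;
		bool writable = false;

		if (m[i].as_id == KERNEL_AS)
			continue;
		for (j = 0; j < n; j++) {
			if (m[j].as_id == m[i].as_id) {
				count++;
				writable = writable || m[j].writable;
			}
		}
		m[i].uncached = (count > 1 && writable);
	}
}
```

With this sketch, a read/write pair of mappings in the same user address
space ends up uncached, while the same pair spread over two user address
spaces stays cached, because the flush on context change keeps them in sync.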
--
B.Walter <bernd at bwct.de> http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers and much more.