svn commit: r253877 - in projects/atomic64/sys: amd64/include i386/include
Alan Cox
alc at rice.edu
Fri Aug 2 15:42:19 UTC 2013
On Aug 2, 2013, at 6:51 AM, Bruce Evans wrote:
> On Fri, 2 Aug 2013, Jung-uk Kim wrote:
>
>> Log:
>> Reimplement atomic operations on PDEs and PTEs in pmap.h. This change
>> significantly reduces duplicate code. Also, it may improve and even correct
>> some questionable implementations.
>
> Do they all (or any) need to be atomic with respect to multiple CPUs?
> It's hard to see how concurrent accesses to page tables can work
> without higher-level locking than is provided by atomic ops.
>
Some do, so that we do not lose a PG_M ("dirty") bit being set concurrently by another processor. However, none of these accesses need to be labeled as acquires or releases.
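
To make the point concrete, here is a minimal userland sketch using C11 <stdatomic.h> rather than the kernel's atomic(9) primitives. The sketch_* names are hypothetical and are not the macros from the commit. It assumes relaxed ordering is enough, as argued above: what matters is that the swap is a single atomic read-modify-write, so a PG_M bit set just before it ends up in the returned value instead of being overwritten.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef uint64_t pt_entry_t;

/* x86 page-table entry bits (values as on amd64). */
#define PG_V	0x001ULL	/* valid */
#define PG_RW	0x002ULL	/* writable */
#define PG_M	0x040ULL	/* modified ("dirty") */

/* Hypothetical relaxed counterparts of pte_load() and pte_store(). */
static inline pt_entry_t
sketch_pte_load(_Atomic pt_entry_t *ptep)
{
	return (atomic_load_explicit(ptep, memory_order_relaxed));
}

static inline void
sketch_pte_store(_Atomic pt_entry_t *ptep, pt_entry_t pte)
{
	atomic_store_explicit(ptep, pte, memory_order_relaxed);
}

/* Atomically swap in a new entry; the old one is returned, PG_M included. */
static inline pt_entry_t
sketch_pte_load_store(_Atomic pt_entry_t *ptep, pt_entry_t pte)
{
	return (atomic_exchange_explicit(ptep, pte, memory_order_relaxed));
}

static inline pt_entry_t
sketch_pte_load_clear(_Atomic pt_entry_t *ptep)
{
	return (sketch_pte_load_store(ptep, 0));
}

/* Single-threaded walk-through; returns the old entry seen by the clear. */
static pt_entry_t
sketch_demo(void)
{
	_Atomic pt_entry_t pte;
	pt_entry_t old;

	atomic_init(&pte, 0);
	sketch_pte_store(&pte, PG_V | PG_RW);
	/* Stand-in for another processor's MMU setting the dirty bit. */
	atomic_fetch_or_explicit(&pte, PG_M, memory_order_relaxed);
	old = sketch_pte_load_clear(&pte);
	assert(sketch_pte_load(&pte) == 0);
	return (old);
}
```

Calling sketch_demo() from a test program returns the old entry with PG_M still set. A separate plain load followed by a plain store of zero would leave a window in which a concurrent update of that bit could be lost.
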
>> Modified: projects/atomic64/sys/amd64/include/pmap.h
>> ==============================================================================
>> --- projects/atomic64/sys/amd64/include/pmap.h Fri Aug 2 00:08:00 2013 (r253876)
>> +++ projects/atomic64/sys/amd64/include/pmap.h Fri Aug 2 00:20:04 2013 (r253877)
>> @@ -185,41 +185,13 @@ extern u_int64_t KPML4phys; /* physical
>> pt_entry_t *vtopte(vm_offset_t);
>> #define vtophys(va) pmap_kextract(((vm_offset_t) (va)))
>>
>> -static __inline pt_entry_t
>> -pte_load(pt_entry_t *ptep)
>> -{
>> - pt_entry_t r;
>> -
>> - r = *ptep;
>> - return (r);
>> -}
>
> This function wasn't atomic with respect to multiple CPUs. Except on
> i386 with PAE, but then it changes a 64-bit object on a 32-bit CPU,
> so it needs some locking just to be atomic with respect to a single CPU.
>
>> -static __inline pt_entry_t
>> -pte_load_store(pt_entry_t *ptep, pt_entry_t pte)
>> -{
>> - pt_entry_t r;
>> -
>> - __asm __volatile(
>> - "xchgq %0,%1"
>> - : "=m" (*ptep),
>> - "=r" (r)
>> - : "1" (pte),
>> - "m" (*ptep));
>> - return (r);
>> -}
>
> This was the main one that was atomic with respect to multiple CPUs on
> both amd64 and i386. This seems to be accidental -- xchg to memory gives
> a lock prefix and slowness whether you want it or not.
>
>> -
>> -#define pte_load_clear(pte) atomic_readandclear_long(pte)
>> -
>> -static __inline void
>> -pte_store(pt_entry_t *ptep, pt_entry_t pte)
>> -{
>> +#define pte_load(ptep) atomic_load_acq_long(ptep)
>> +#define pte_load_store(ptep, pte) atomic_swap_long(ptep, pte)
>> +#define pte_load_clear(pte) atomic_swap_long(pte, 0)
>> +#define pte_store(ptep, pte) atomic_store_rel_long(ptep, pte)
>> +#define pte_clear(ptep) atomic_store_rel_long(ptep, 0)
>>
>> - *ptep = pte;
>> -}
>
> pte_store() was also not atomic with respect to multiple CPUs. So almost
> everything was not atomic with respect to multiple CPUs, except for PAE
> on i386.
>
> Bruce
>