superpages for UMA

Alan Cox alc at rice.edu
Mon Aug 18 20:39:23 UTC 2014


On 08/18/2014 15:13, Peter Grehan wrote:
>> Newer Intel CPUs have more entries, and AMD CPUs have long (since
>> Barcelona) had more.  In particular, they allow 2 MB page mappings to be
>> cached in a larger L2 TLB.  Nowadays, the trouble is with the 1 GB
>> pages.
>> A lot of CPUs still only support an 8 entry, 1 level TLB for 1 GB pages.
>
>  There are new(ish) ones effectively without 1GB pages. From the
> "Software Optimization Guide for AMD Family 16h Processors"
>


My recollection is that the first Intel processors to support 1 GB page
mappings did this.  They allowed you to set PG_PS on the 1 GB PTE, but
there were no actual 1 GB page TLB entries.

Also, after I modified the direct map on amd64 to use 1 GB pages, I
noticed some strange performance anomalies.  Specifically, sometimes
performance was worse than I expected.  It turned out that when the end
of DRAM wasn't aligned to a 1 GB boundary, and the end of DRAM was
mapped with a 1 GB PTE, the TLB would wind up with 4 KB mappings for
anything covered by that last PTE.  Before the change, it was at least
2 MB aligned, and we would wind up with 2 MB page mappings in the TLB.
So, now, the direct map creation code is aware of this issue.
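
To make the alignment issue concrete, here is a minimal sketch in C of
the kind of decision involved.  It is not the actual pmap code: the
names plan_direct_map and struct dmap_plan are hypothetical, and it
assumes DRAM is a single contiguous range starting at physical address
0.  The idea is simply to use 1 GB mappings only for regions that are
completely backed by DRAM, and to cover any partial tail with 2 MB
mappings instead of one more 1 GB mapping.

```c
#include <stdint.h>

#define SIZE_2M (UINT64_C(1) << 21)
#define SIZE_1G (UINT64_C(1) << 30)

/* Hypothetical summary of how a direct map would be built. */
struct dmap_plan {
	uint64_t n1g;	/* number of 1 GB mappings */
	uint64_t n2m;	/* number of 2 MB mappings covering the tail */
};

/*
 * Plan a direct map for the physical range [0, dram_end).
 * Assumption: DRAM is contiguous and starts at 0.
 */
static struct dmap_plan
plan_direct_map(uint64_t dram_end)
{
	struct dmap_plan p;
	uint64_t tail;

	/* Only 1 GB regions that lie entirely below dram_end qualify. */
	p.n1g = dram_end / SIZE_1G;

	/* Whatever is left over is less than 1 GB; round it up to 2 MB. */
	tail = dram_end % SIZE_1G;
	p.n2m = (tail + SIZE_2M - 1) / SIZE_2M;

	return (p);
}
```

For example, with the end of DRAM at 6 GB + 512 MB, this plan uses six
1 GB mappings and 256 2 MB mappings, rather than seven 1 GB mappings
with the last one extending past the end of DRAM.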


> "Smashing"
>   ...
> "when the Family 16h processor encounters a 1-Gbyte page size, it will
> smash translations of that 1-Gbyte region into 2-Mbyte TLB entries, each
> of which translates a 2-Mbyte region of the 1-Gbyte page."
>
> later,
>
> Peter.
>



More information about the freebsd-arch mailing list