Large virtual page size support.
Scott Long
scottl at samsco.org
Sun Jan 22 21:01:28 PST 2006
Jeff Roberson wrote:
> On Fri, 20 Jan 2006, Julian Elischer wrote:
>
>> Alan Cox wrote:
>>
>>> On Tue, Jan 17, 2006 at 11:18:56PM +0100, Poul-Henning Kamp wrote:
>>>
>>>> In message <43CD612E.2060002 at elischer.org>, Julian Elischer writes:
>>>>
>>>>> Jeff Roberson wrote:
>>>>>
>>>>>
>>>>>> I have implemented support in the vm for PAGE_SIZE values which
>>>>>> are a multiple of the hardware page size. This is primarily
>>>>>> useful for two things:
>>>>>>
>>>>> Mach (and the VM system we inherited from it) had this. I believe
>>>>> it was removed with the comment
>>>>> "If we need this and someone is willing to support it it can be
>>>>> added back" .
>>>>>
>>>> It was a VAX artifact and not very usable. I believe we have a couple
>>>> of comments and macros which still talk about "clicks".
>>>>
>>>
>>> Like Jeff's patch, Mach's VM design allowed for two distinct page
>>> sizes, one being the native, hardware page size and the other being a
>>> larger, abstract page size. The essential difference between Jeff's
>>> patch and what Mach did on the VAX is that Mach's use of the native,
>>> hardware page size was entirely within the pmap and locore-level code.
>>> For example, the hardware-supported page size on the VAX was 512
>>> bytes. However, as far as the machine-independent layer of the Mach
>>> kernel was concerned the page size was 4K bytes. This included the
>>> machine-independent part of the virtual memory system; it too believed
>>> that the page size was 4K bytes. As a consequence, the granularity
>>> of mappings and protection was 4K bytes. Finally, there was nothing
>>> VAX-specific about the design and implementation of this feature.
>>> However, I don't recall any other pmap implementations having
>>> different native and abstract page sizes. Today, I speculate that you
>>> could implement a distinct native and abstract page size on the sparc
>>> because different versions of the processor have had different page sizes.
>>> Consequently, the ABI documents that I've seen don't specify a
>>> particular page size, only that 64K bytes is the largest that a page
>>> will ever be; to learn the precise page size, they say that you must
>>> call the OS at run time. So, you could use a larger abstract page
>>> without breaking the ABI.
>>>
>>> In contrast, Jeff's patch has both the machine-dependent and
>>> machine-independent layers knowing about both page sizes. Moreover,
>>> the granularity of mappings and protection is still the native,
>>> hardware page size. In other words, within the vm_map the page size
>>> is the native, hardware page size, but over in the vm_object the page
>>> size is the larger, abstract size. (Reread the last sentence again
>>> before continuing.) As you can imagine, this is a lot trickier to get
>>> right in the first place and maintain in the long run than what Mach
>>> did. This is why Jeff is being so circumspect about committing this
>>> work. On the other hand, it offers essentially the same benefits as what
>>> Mach did without breaking the i386 ABI.
>>>
>>
>> Was this the reason that it was done in a different way?
>> What was the reason to not do it entirely in the pmap layer (as Mach did)?
>> I know the Mach people were very proud of their implementation. It
>> always appeared in their technical descriptions.
>>
>> The phrase "this is a lot trickier to [...] maintain in the long run"
>> worries me.. There must be a reason to not go with the simpler
>> approach..
>> What was it?
>
>
> It doesn't maintain backwards compatibility. I originally implemented
> it in the Mach way, but you have to recompile the entire system with the
> larger page size. This patch grew the MI parts to support existing
> binaries.
>
> It is complex. I was hoping for someone to chime in and say "That's
> great, we need that" or "No, that's not useful at all". Unfortunately,
> the response is somewhere in the middle. I guess the best course is to
> port it forward and test it on some x86 machines and see if it makes a
> big difference.
>
> Cheers,
> Jeff
>
Yes, we need that. Please commit =-)
Scott
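
For anyone following along, here is a minimal sketch in C of the arithmetic behind
Alan's VAX example quoted above (512-byte hardware pages, 4K abstract pages) and
of asking the OS for the page size at run time, as the ABI documents he mentions
require. The macro and function names are hypothetical and chosen only for
illustration; none of them come from Jeff's patch or from the Mach sources. The
only real interface used is POSIX sysconf(_SC_PAGESIZE).

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical names; the sizes are the VAX figures from the thread. */
#define HW_PAGE_SIZE   512u    /* native, hardware page size */
#define ABS_PAGE_SIZE  4096u   /* larger, abstract page size */

/* Number of hardware pages that back one abstract page. */
static unsigned hw_pages_per_abs_page(void)
{
    return ABS_PAGE_SIZE / HW_PAGE_SIZE;
}

/* Round an address down to a page boundary of the given size. */
static uintptr_t page_trunc(uintptr_t addr, uintptr_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0); /* power of two only */
    return addr & ~(size - 1);
}

/* Index of the hardware page that holds addr within its abstract page. */
static unsigned hw_index_in_abs_page(uintptr_t addr)
{
    return (unsigned)((addr & (ABS_PAGE_SIZE - 1)) / HW_PAGE_SIZE);
}

/* Ask the OS for the page size at run time instead of hard-coding it. */
static long runtime_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}
```

With these figures, one abstract page covers eight hardware pages. In the Mach
scheme only the pmap would ever deal with the 512-byte granularity; in the
scheme Jeff describes, code on both sides has to be aware of both sizes, which
is where the extra complexity comes from.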
More information about the freebsd-arch
mailing list