Large virtual page size support.

Sun Jan 22 21:01:28 PST 2006

Jeff Roberson wrote:
> On Fri, 20 Jan 2006, Julian Elischer wrote:
> 
>> Alan Cox wrote:
>>
>>> On Tue, Jan 17, 2006 at 11:18:56PM +0100, Poul-Henning Kamp wrote:
>>>
>>>> In message <43CD612E.2060002 at elischer.org>, Julian Elischer writes:
>>>>
>>>>> Jeff Roberson wrote:
>>>>>
>>>>>
>>>>>> I have implemented support in the vm for PAGE_SIZE values which 
>>>>>> are a multiple of the hardware page size.  This is primarily 
>>>>>> useful for two things:
>>>>>>
>>>>> Mach (and the VM system we inherited from it) had this. I believe 
>>>>> it was removed with the comment
>>>>> "If we need this and someone is willing to support it it can be 
>>>>> added back".
>>>>>
>>>> It was a VAX artifact and not very usable.  I believe we have a couple
>>>> of comments and macros which still talk about "clicks".
>>>>
>>>
>>> Like Jeff's patch, Mach's VM design allowed for two distinct page
>>> sizes, one being the native, hardware page size and the other being a
>>> larger, abstract page size.  The essential difference between Jeff's
>>> patch and what Mach did on the VAX is that Mach's use of the native,
>>> hardware page size was entirely within the pmap and locore-level code.
>>> For example, the hardware-supported page size on the VAX was 512
>>> bytes.  However, as far as the machine-independent layer of the Mach
>>> kernel was concerned the page size was 4K bytes.  This included the
>>> machine-independent part of the virtual memory system; it too believed
>>> that the page size was 4K bytes.  As a consequence, the granularity
>>> of mappings and protection was 4K bytes.  Finally, there was nothing
>>> VAX-specific about the design and implementation of this feature.
>>> However, I don't recall any other pmap implementations having
>>> different native and abstract page sizes.  Today, I speculate that you
>>> could implement a distinct native and abstract page size on the sparc
>>> because different versions of the processor have had different page
>>> sizes.  Consequently, the ABI documents that I've seen don't specify a
>>> particular page size, only that 64K bytes is the largest that a page
>>> will ever be; to learn the precise page size, they say that you must
>>> call the OS at run time.  So, you could use a larger abstract page
>>> without breaking the ABI.
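
The run-time query and the native/abstract ratio described above can be sketched as follows. This is only an illustration under stated assumptions, not code from the patch or from Mach: `native_page_size` and `hw_pages_per_abstract_page` are hypothetical helper names, and the only figures taken from the thread are the VAX's 512-byte hardware page and 4K abstract page.

```python
import os


def native_page_size() -> int:
    """Ask the OS for the hardware page size at run time (POSIX sysconf)."""
    return os.sysconf("SC_PAGESIZE")


def hw_pages_per_abstract_page(abstract: int, native: int) -> int:
    """Number of hardware pages backing one abstract page.

    Hypothetical helper: the abstract size must be a whole multiple
    of the native size, as in the scheme discussed in the thread.
    """
    if native <= 0 or abstract % native != 0:
        raise ValueError("abstract page size must be a multiple of the native size")
    return abstract // native


if __name__ == "__main__":
    # VAX figures from the thread: 512-byte hardware pages, 4K abstract pages.
    print("VAX hardware pages per abstract page:",
          hw_pages_per_abstract_page(4096, 512))
    print("native page size on this machine:", native_page_size())
```

A program that learns the page size this way keeps working if the kernel reports a larger abstract size, which is the sense in which the ABI would not be broken.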
>>>
>>> In contrast, Jeff's patch has both the machine-dependent and
>>> machine-independent layers knowing about both page sizes.  Moreover,
>>> the granularity of mappings and protection is still the native,
>>> hardware page size.  In other words, within the vm_map the page size
>>> is the native, hardware page size, but over in the vm_object the page
>>> size is the larger, abstract size.  (Reread the last sentence
>>> before continuing.)  As you can imagine, this is a lot trickier to get
>>> right in the first place and maintain in the long run than what Mach
>>> did.  This is why Jeff is being so circumspect about committing this
>>> work.  On the other hand, it offers essentially the same benefits as what
>>> Mach did without breaking the i386 ABI.
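
To make the two granularities concrete, here is a rough sketch of the arithmetic involved. It is an assumption-laden illustration, not the patch's code: the 16K abstract size is a made-up value, and `split_offset` and `abstract_pages_touched` are hypothetical names. Only the 4K i386 hardware page size is a known figure.

```python
NATIVE_PAGE_SIZE = 4096       # i386 hardware page size
ABSTRACT_PAGE_SIZE = 16384    # assumed larger PAGE_SIZE (hypothetical value)


def split_offset(offset: int) -> tuple[int, int, int]:
    """Split a byte offset into (abstract page index,
    hardware sub-page index within that abstract page,
    byte offset within the hardware page)."""
    abstract_index, within = divmod(offset, ABSTRACT_PAGE_SIZE)
    sub_index, byte = divmod(within, NATIVE_PAGE_SIZE)
    return abstract_index, sub_index, byte


def abstract_pages_touched(start: int, length: int) -> list[int]:
    """Abstract pages overlapped by a mapping that is aligned only to
    the native page size, i.e. the granularity the map layer sees."""
    if start % NATIVE_PAGE_SIZE or length % NATIVE_PAGE_SIZE or length <= 0:
        raise ValueError("mapping must be a positive multiple of the native page size")
    first = start // ABSTRACT_PAGE_SIZE
    last = (start + length - 1) // ABSTRACT_PAGE_SIZE
    return list(range(first, last + 1))


if __name__ == "__main__":
    # An 8K mapping starting at 12K is valid at native granularity
    # but straddles two abstract pages on the object side.
    print(abstract_pages_touched(12288, 8192))
    print(split_offset(20480))
```

The straddling case is where the extra bookkeeping comes from: a mapping or protection change can cover only part of an abstract page.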
>>>
>>
>> Was this the reason that it was done in a different way?
>> What was the reason not to do it entirely in the pmap layer (as Mach did)?
>> I know the Mach people were very proud of their implementation. It
>> always appeared in their technical descriptions.
>>
>> The phrase "this is a lot trickier to [...] maintain in the long run"
>> worries me.  There must be a reason not to go with the simpler 
>> approach.
>> What was it?
> 
> 
> The Mach approach doesn't maintain backwards compatibility.  I originally 
> implemented it the Mach way, but you have to recompile the entire system 
> with the larger page size.  This patch grew the MI parts to support 
> existing binaries.
> 
> It is complex.  I was hoping for someone to chime in and say "That's 
> great, we need that" or "No, that's not useful at all".  Unfortunately, 
> the response is somewhere in the middle.  I guess the best course is to 
> port it forward and test it on some x86 machines and see if it makes a 
> big difference.
> 
> Cheers,
> Jeff
> 

Yes, we need that.  Please commit =-)

Scott