Question about adding flags to mmap system call / NVIDIA amd64 driver implementation

Julian Bangert julidaoc at online.de
Tue Apr 28 20:32:13 UTC 2009


Hello,

I am currently working on the remaining "missing feature" that NVIDIA
requires (see http://wiki.freebsd.org/NvidiaFeatureRequests or an earlier
post on this mailing list): the improved mmap system call.
  For now, I am trying to extend the current system call and its
implementation to add cache control (the type of memory caching used).
This feature is inherently very architecture-specific, but it can lead to
enormous performance improvements for memory-mapped devices (useful for
drivers, etc.). I would do this on the user side by adding 3 flags to the
mmap system call (MEM_CACHE__ATTR1 to MEM_CACHE__ATTR3), which together
form a single octal digit selecting one of the caching options
(Uncacheable, Write Combining, etc.). The digit uses the same numbers as
the PAT_* macros from i386/include/specialreg.h, with one exception:
PAT_UNCACHEABLE, which has the value 0 there, is encoded as 2 (a value the
PAT_* macros leave undefined), so that 0 (all 3 flags cleared) can mean
"feature not used, use default cache control".
For each cache behaviour there would of course also be a macro expanding
to the right combination of these flags, for ease of use.
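
A minimal sketch of the encoding described above. The bit positions
(MEM_CACHE_SHIFT), the MEM_CACHE_MASK and MEM_CACHE_<mode> macro names, and
the two helper functions are assumptions made for illustration; the post
does not specify them. The PAT_* values are repeated from
i386/include/specialreg.h so the sketch compiles on its own and should be
checked against that header.

```c
#include <assert.h>
#include <stddef.h>

/* PAT_* values as in i386/include/specialreg.h (verify against the header). */
#define PAT_UNCACHEABLE     0x00
#define PAT_WRITE_COMBINING 0x01
#define PAT_WRITE_THROUGH   0x04
#define PAT_WRITE_PROTECTED 0x05
#define PAT_WRITE_BACK      0x06
#define PAT_UNCACHED        0x07

/* Hypothetical bit positions: the post does not say which flag bits are used. */
#define MEM_CACHE_SHIFT   24
#define MEM_CACHE__ATTR1  (1 << MEM_CACHE_SHIFT)
#define MEM_CACHE__ATTR2  (2 << MEM_CACHE_SHIFT)
#define MEM_CACHE__ATTR3  (4 << MEM_CACHE_SHIFT)
#define MEM_CACHE_MASK    (MEM_CACHE__ATTR1 | MEM_CACHE__ATTR2 | MEM_CACHE__ATTR3)

/* Hypothetical convenience macros, one per cache behaviour. */
#define MEM_CACHE_UNCACHEABLE     (MEM_CACHE__ATTR2)                    /* digit 2 */
#define MEM_CACHE_WRITE_COMBINING (MEM_CACHE__ATTR1)                    /* digit 1 */
#define MEM_CACHE_WRITE_BACK      (MEM_CACHE__ATTR2 | MEM_CACHE__ATTR3) /* digit 6 */

/* Turn a PAT_* value into flag bits; returns -1 for a value that is not a PAT mode. */
int
mem_cache_encode(int pat)
{
	switch (pat) {
	case PAT_UNCACHEABLE:
		return (2 << MEM_CACHE_SHIFT);	/* 0 is reserved for "use default" */
	case PAT_WRITE_COMBINING:
	case PAT_WRITE_THROUGH:
	case PAT_WRITE_PROTECTED:
	case PAT_WRITE_BACK:
	case PAT_UNCACHED:
		return (pat << MEM_CACHE_SHIFT);
	default:
		return (-1);
	}
}

/*
 * Extract the cache digit from mmap flags.
 * Returns 0 if no cache flags are set (use the default cache control),
 * 1 if *pat was filled in, and -1 if the digit is not a defined mode.
 */
int
mem_cache_decode(int flags, int *pat)
{
	int digit = (flags & MEM_CACHE_MASK) >> MEM_CACHE_SHIFT;

	assert(pat != NULL);
	switch (digit) {
	case 0:
		return (0);
	case 2:
		*pat = PAT_UNCACHEABLE;
		return (1);
	case PAT_WRITE_COMBINING:
	case PAT_WRITE_THROUGH:
	case PAT_WRITE_PROTECTED:
	case PAT_WRITE_BACK:
	case PAT_UNCACHED:
		*pat = digit;
		return (1);
	default:
		return (-1);	/* digit 3 has no meaning */
	}
}
```

Under these assumptions a caller would OR, for example,
MEM_CACHE_WRITE_COMBINING into the usual mmap flags, and a flags word with
none of the three bits set keeps today's behaviour.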

  The mmap system call would, if any of these flags are set, decode them  
and get a corresponding PAT_* value, perform the mapping and then call  
into the pmap module to modify the cache attributes for every page.
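
The last step of that flow (walking every page of the new mapping and
changing its cache attribute) can be modelled in userland as below. This is
a toy model, not kernel code: struct toy_page, toy_pmap_page_set_attr() and
toy_mmap_set_cache() are hypothetical stand-ins, and TOY_PAGE_SIZE is
assumed to be 4096.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size for the model */

/* Stand-in for per-page state; a real kernel keeps this in the pmap layer. */
struct toy_page {
	int pat;	/* current cache mode of the page */
};

/* Hypothetical stand-in for a pmap routine that changes one page's mode. */
static void
toy_pmap_page_set_attr(struct toy_page *pg, int pat)
{
	pg->pat = pat;
}

/*
 * Apply one cache mode to every page covering [off, off + len).
 * Returns the number of pages changed, or -1 if the offset is not
 * page-aligned, the length is zero, or the range leaves the object.
 */
long
toy_mmap_set_cache(struct toy_page *pages, size_t npages, size_t off,
    size_t len, int pat)
{
	size_t first, count, i;

	assert(pages != NULL);
	if (off % TOY_PAGE_SIZE != 0 || len == 0)
		return (-1);
	first = off / TOY_PAGE_SIZE;
	/* Round the length up to whole pages without overflowing. */
	count = len / TOY_PAGE_SIZE + (len % TOY_PAGE_SIZE != 0);
	if (first > npages || count > npages - first)
		return (-1);
	for (i = 0; i < count; i++)
		toy_pmap_page_set_attr(&pages[first + i], pat);
	return ((long)count);
}
```

In this model the range is validated before any page is touched, so a
failed call leaves every page's cache mode unchanged.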

  My first question is whether there is a more elegant way of solving
this. The 3 flags would be architecture-specific (though they could be
used for other things on other architectures if need be), and I do not
know the policy on architecture-specific syscall flags, so I would
appreciate any input.

The second question goes to all those great VM/pmap gurus out there: As
far as I understand, at the moment pmap_change_attr can only change the
cache flags for kernel pages. Is there a particular reason why this
function could not be adapted/extended to userspace mappings? If not, I
would either add a new function to iterate over all pages and set cache
flags for a particular region, or add a new member (possibly just the 3
flags again?) to the md part of vm_page_t. Alternatively, one could just
keep track of mappings and return an error as soon as someone tries to map
a memory region (cache-customized mapping is usually done to device
memory) that is already mapped with different cache behaviour.
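
The "keep track and return errors" option could look roughly like the
following userland sketch. Everything here is hypothetical (struct
toy_tracker, toy_track_mapping(), the fixed-size table, and the choice of
EBUSY as the error); it only illustrates rejecting a mapping that overlaps
an existing one with a different cache mode.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_RANGES 8	/* arbitrary table size for the sketch */

/* One tracked mapping: the half-open range [start, end) and its cache mode. */
struct toy_range {
	unsigned long start;
	unsigned long end;
	int pat;
};

struct toy_tracker {
	struct toy_range r[TOY_MAX_RANGES];
	size_t n;
};

/*
 * Record a new mapping. Returns 0 on success, EINVAL for an empty range,
 * EBUSY if it overlaps a tracked range that uses a different cache mode,
 * and ENOMEM if the table is full.
 */
int
toy_track_mapping(struct toy_tracker *t, unsigned long start,
    unsigned long end, int pat)
{
	size_t i;

	assert(t != NULL);
	if (start >= end)
		return (EINVAL);
	for (i = 0; i < t->n; i++) {
		const struct toy_range *r = &t->r[i];
		int overlaps = start < r->end && r->start < end;

		if (overlaps && r->pat != pat)
			return (EBUSY);
	}
	if (t->n == TOY_MAX_RANGES)
		return (ENOMEM);
	t->r[t->n].start = start;
	t->r[t->n].end = end;
	t->r[t->n].pat = pat;
	t->n++;
	return (0);
}
```

Overlapping mappings with the same cache mode are allowed in this sketch;
only a conflicting mode is refused.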

I thank you for your assistance & happy coding,

--Julian Bangert



More information about the freebsd-hackers mailing list