Question about adding flags to mmap system call / NVIDIA amd64 driver implementation

Kevin Day toasty at dragondata.com
Tue Apr 28 22:06:09 UTC 2009


On Apr 28, 2009, at 3:19 PM, Julian Bangert wrote:

> Hello,
>
> I am currently trying to work a bit on the remaining "missing
> feature" that NVIDIA requires (http://wiki.freebsd.org/NvidiaFeatureRequests
> or an earlier post on this ML): the improved mmap system call.
> For now, I am trying to extend the current system call and
> implementation to add cache control (the type of memory caching
> used). This feature is inherently very architecture-specific, but
> it can lead to enormous performance improvements for memory-mapped
> devices (useful for drivers, etc.). I would do this on the user side
> by adding 3 flags to the mmap system call (MEM_CACHE__ATTR1 to
> MEM_CACHE__ATTR3) which together form a single octal digit
> corresponding to the various caching options (Uncacheable,
> Write Combining, etc.) with the same numbers as the PAT_* macros
> from i386/include/specialreg.h, except that the value 0
> (PAT_UNCACHEABLE) is replaced with value 2 (undefined), whereas
> value 0 (all 3 flags cleared) is assigned the meaning "feature not
> used, use default cache control".
> For each cache behaviour there would of course also be a macro
> expanding to the right combination of these flags for enhanced
> usability.
>
> The mmap system call would, if any of these flags are set, decode  
> them and get a corresponding PAT_* value, perform the mapping and  
> then call into the pmap module to modify the cache attributes for  
> every page.
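
To make sure I am reading the proposed encoding correctly, here is a
rough sketch of it in C. The flag names are the ones from the
proposal; the bit position (MEM_CACHE_SHIFT), the helper names, and
the handling of the reserved value 3 are assumptions made up for
illustration, and the PAT_* values are copied locally from what I
believe specialreg.h defines, so check them against the header.

```c
#include <stdint.h>

/* PAT memory types, mirrored locally (verify against specialreg.h). */
#define PAT_UNCACHEABLE      0x00
#define PAT_WRITE_COMBINING  0x01
#define PAT_WRITE_THROUGH    0x04
#define PAT_WRITE_PROTECTED  0x05
#define PAT_WRITE_BACK       0x06
#define PAT_UNCACHED         0x07

/* Hypothetical bit position of the three flags; not a real mmap flag. */
#define MEM_CACHE_SHIFT   24
#define MEM_CACHE__ATTR1  (1 << MEM_CACHE_SHIFT)
#define MEM_CACHE__ATTR2  (2 << MEM_CACHE_SHIFT)
#define MEM_CACHE__ATTR3  (4 << MEM_CACHE_SHIFT)
#define MEM_CACHE_MASK    (7 << MEM_CACHE_SHIFT)

/*
 * Turn a PAT_* value into flag bits.  PAT_UNCACHEABLE (0) is moved to
 * the otherwise undefined value 2, so that "all flags clear" can mean
 * "use the default cache control".
 */
int
cache_flags_encode(int pat)
{
	int digit = (pat == PAT_UNCACHEABLE) ? 2 : pat;

	return ((digit << MEM_CACHE_SHIFT) & MEM_CACHE_MASK);
}

/*
 * Turn mmap flags back into a PAT_* value.
 * Returns -1 if no cache control was requested and -2 for the
 * reserved encoding 3 (an assumption; the proposal does not say).
 */
int
cache_flags_decode(int flags)
{
	int digit = (flags & MEM_CACHE_MASK) >> MEM_CACHE_SHIFT;

	if (digit == 0)
		return (-1);
	if (digit == 2)
		return (PAT_UNCACHEABLE);
	if (digit == 3)
		return (-2);
	return (digit);
}
```

With this layout a caller would OR cache_flags_encode(PAT_WRITE_COMBINING)
into the mmap flags, and the kernel side would call cache_flags_decode()
before handing the result to pmap.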

Have you looked at mem(4) yet?

      Several architectures allow attributes to be associated with ranges of
      physical memory.  These attributes can be manipulated via ioctl() calls
      performed on /dev/mem.  Declarations and data types are to be found in
      <sys/memrange.h>.

      The specific attributes, and number of programmable ranges may vary
      between architectures.  The full set of supported attributes is:

      MDF_UNCACHEABLE
              The region is not cached.

      MDF_WRITECOMBINE
              Writes to the region may be combined or performed out of order.

      MDF_WRITETHROUGH
              Writes to the region are committed synchronously.

      MDF_WRITEBACK
              Writes to the region are committed asynchronously.

      MDF_WRITEPROTECT
              The region cannot be written to.

This requires knowledge of the physical addresses, but I believe  
that's probably already necessary for what it sounds like you're  
trying to accomplish.
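
As a minimal sketch of what that looks like from userland, assuming
the MEMRANGE_SET ioctl and the struct mem_range_desc / struct
mem_range_op layout from <sys/memrange.h>: the helper name
set_write_combining(), the "example" owner string, the read-only open
of /dev/mem, and the early zero-length check are my own choices for
illustration. The FreeBSD-specific part is guarded so the file still
compiles elsewhere, where it simply fails with ENOSYS.

```c
#include <errno.h>
#include <stdint.h>

#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/memrange.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#endif

/*
 * Hypothetical helper: ask the kernel to mark the physical range
 * [base, base + len) as write-combining through mem(4).
 * Returns 0 on success, -1 with errno set on failure.
 */
int
set_write_combining(uint64_t base, uint64_t len)
{
	if (len == 0) {
		/* Reject empty ranges up front (a choice made for this sketch). */
		errno = EINVAL;
		return (-1);
	}
#ifdef __FreeBSD__
	struct mem_range_desc mrd;
	struct mem_range_op mro;
	int fd, rc, saved;

	/* Opened read-only here as an assumption; needs sufficient privilege. */
	fd = open("/dev/mem", O_RDONLY);
	if (fd < 0)
		return (-1);

	memset(&mrd, 0, sizeof(mrd));
	mrd.mr_base = base;
	mrd.mr_len = len;
	mrd.mr_flags = MDF_WRITECOMBINE;
	strncpy(mrd.mr_owner, "example", sizeof(mrd.mr_owner) - 1);

	memset(&mro, 0, sizeof(mro));
	mro.mo_desc = &mrd;
	mro.mo_arg[0] = MEMRANGE_SET_UPDATE;

	rc = ioctl(fd, MEMRANGE_SET, &mro);
	saved = errno;		/* keep the ioctl error across close() */
	close(fd);
	errno = saved;
	return (rc);
#else
	(void)base;
	errno = ENOSYS;		/* mem(4) range attributes are FreeBSD-specific */
	return (-1);
#endif
}
```

You would call it with the physical base and size of the buffer, e.g.
set_write_combining(0xd0000000, 16 * 1024 * 1024) for a made-up 16 MB
aperture. The hardware may still refuse ranges it cannot describe, so
check the return value.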

Back in the FreeBSD-3.0 days, I was writing a custom driver for an AGP  
graphics controller, and setting the MTRR flags for the exposed buffer  
was a definite improvement (200-1200% faster in most cases).

-- Kevin


