Question about adding flags to mmap system call / NVIDIA amd64
toasty at dragondata.com
Tue Apr 28 22:06:09 UTC 2009
On Apr 28, 2009, at 3:19 PM, Julian Bangert wrote:
> I am currently trying to work a bit on the remaining "missing
> feature" that NVIDIA requires (http://wiki.freebsd.org/NvidiaFeatureRequests
> or a back post in this ML): the improved mmap system call.
> For now, I am trying to extend the current system call and
> implementation to add cache control (the type of memory caching
> used). This feature is inherently very architecture-specific, but
> it can lead to enormous performance improvements for memory-mapped
> devices (useful for drivers, etc.). I would do this on the user side
> by adding 3 flags to the mmap system call (MEM_CACHE__ATTR1 to
> MEM_CACHE__ATTR3) which together form a single octal digit
> corresponding to the various caching options (like Uncacheable,
> Write Combining, etc.) with the same numbers as the PAT_* macros
> from i386/include/specialreg.h, except that PAT_UNCACHEABLE (value 0)
> is moved to value 2 (undefined in PAT), whereas value 0 (all 3 flags
> cleared) is assigned the meaning "feature not used, use default
> cache control".
> For each cache behaviour there would of course also be a macro
> expanding to the right combination of these flags for enhanced
> The mmap system call would, if any of these flags are set, decode
> them and get a corresponding PAT_* value, perform the mapping and
> then call into the pmap module to modify the cache attributes for
> every page.
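
If I read the proposal correctly, the decode step could be sketched roughly as below. This is only an illustration: the bit positions chosen for MEM_CACHE__ATTR1..3, the assumption that ATTR1 is the low bit, the convenience macro names, and the mem_cache_decode() helper with its two sentinel return values are all hypothetical. Only the PAT_* numbers are taken from i386/include/specialreg.h.

```c
/* PAT_* encodings as found in i386/include/specialreg.h. */
#define PAT_UNCACHEABLE     0x00
#define PAT_WRITE_COMBINING 0x01
#define PAT_WRITE_THROUGH   0x04
#define PAT_WRITE_PROTECTED 0x05
#define PAT_WRITE_BACK      0x06
#define PAT_UNCACHED        0x07

/* Hypothetical placement of the three proposed flags in the mmap flags word. */
#define MEM_CACHE_SHIFT   20
#define MEM_CACHE__ATTR1  (1 << (MEM_CACHE_SHIFT + 0))  /* assumed low bit */
#define MEM_CACHE__ATTR2  (1 << (MEM_CACHE_SHIFT + 1))
#define MEM_CACHE__ATTR3  (1 << (MEM_CACHE_SHIFT + 2))
#define MEM_CACHE_MASK    (MEM_CACHE__ATTR1 | MEM_CACHE__ATTR2 | MEM_CACHE__ATTR3)

/* Hypothetical convenience macros, one per cache behaviour. */
#define MEM_CACHE_ENCODE(digit)    ((digit) << MEM_CACHE_SHIFT)
#define MEM_CACHE_UNCACHEABLE      MEM_CACHE_ENCODE(2)  /* PAT value 0 moved to 2 */
#define MEM_CACHE_WRITE_COMBINING  MEM_CACHE_ENCODE(PAT_WRITE_COMBINING)
#define MEM_CACHE_WRITE_THROUGH    MEM_CACHE_ENCODE(PAT_WRITE_THROUGH)
#define MEM_CACHE_WRITE_PROTECTED  MEM_CACHE_ENCODE(PAT_WRITE_PROTECTED)
#define MEM_CACHE_WRITE_BACK       MEM_CACHE_ENCODE(PAT_WRITE_BACK)
#define MEM_CACHE_UNCACHED         MEM_CACHE_ENCODE(PAT_UNCACHED)

/* Hypothetical sentinels returned by the decoder. */
#define MEM_CACHE_DEFAULT  (-1)  /* all three flags clear: feature not used */
#define MEM_CACHE_INVALID  (-2)  /* digit with no PAT meaning (here: 3) */

/*
 * Extract the octal digit from the mmap flags and translate it to a PAT_*
 * value, or to one of the sentinels above.
 */
static int
mem_cache_decode(int flags)
{
    int digit = (flags & MEM_CACHE_MASK) >> MEM_CACHE_SHIFT;

    switch (digit) {
    case 0:
        return MEM_CACHE_DEFAULT;
    case 2:
        return PAT_UNCACHEABLE;
    case PAT_WRITE_COMBINING:
    case PAT_WRITE_THROUGH:
    case PAT_WRITE_PROTECTED:
    case PAT_WRITE_BACK:
    case PAT_UNCACHED:
        return digit;
    default:
        return MEM_CACHE_INVALID;
    }
}
```

With something like this, mmap would only need to call into pmap when the result is neither MEM_CACHE_DEFAULT nor MEM_CACHE_INVALID, and could reject the invalid case with EINVAL.
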
Have you looked at mem(4) yet?
Several architectures allow attributes to be associated with ranges of
physical memory. These attributes can be manipulated via ioctl() calls
performed on /dev/mem. Declarations and data types are to be found in
<sys/memrange.h>.

The specific attributes, and number of programmable ranges may vary
between architectures. The full set of supported attributes is:

MDF_UNCACHEABLE
    The region is not cached.
MDF_WRITECOMBINE
    Writes to the region may be combined or performed out of order.
MDF_WRITETHROUGH
    Writes to the region are committed synchronously.
MDF_WRITEBACK
    Writes to the region are committed asynchronously.
MDF_WRITEPROTECT
    The region cannot be written to.

This requires knowledge of the physical addresses, but I believe
that's probably already necessary for what it sounds like you're
trying to accomplish.
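
For reference, a minimal userland sketch of marking a physical range write-combining through the mem(4) ioctl interface might look like the following. The helper name set_write_combining() and its argument checks are my own invention. I am also assuming that opening /dev/mem read-only is sufficient for the ioctl and that the caller has the necessary privileges. On x86 the range is backed by MTRRs, so the hardware will likely impose size and alignment constraints on base and len that this sketch does not check. On non-FreeBSD systems the function simply fails with ENOSYS.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/memrange.h>
#include <fcntl.h>
#include <unistd.h>
#endif

/*
 * Hypothetical helper: ask the kernel to apply MDF_WRITECOMBINE to the
 * physical range [base, base + len). Returns 0 on success, -1 with errno
 * set on failure.
 */
int
set_write_combining(uint64_t base, uint64_t len, const char *owner)
{
    if (len == 0 || owner == NULL) {
        errno = EINVAL;
        return -1;
    }
#ifdef __FreeBSD__
    struct mem_range_desc mrd;
    struct mem_range_op mro;
    int fd, rc, saved_errno;

    /* Describe the range and the attribute we want on it. */
    memset(&mrd, 0, sizeof(mrd));
    mrd.mr_base = base;
    mrd.mr_len = len;
    mrd.mr_flags = MDF_WRITECOMBINE;
    strncpy(mrd.mr_owner, owner, sizeof(mrd.mr_owner) - 1);

    /* One descriptor, "update" operation. */
    memset(&mro, 0, sizeof(mro));
    mro.mo_desc = &mrd;
    mro.mo_arg[0] = MEMRANGE_SET_UPDATE;

    /* Assumption: read-only access to /dev/mem is enough for the ioctl. */
    fd = open("/dev/mem", O_RDONLY);
    if (fd == -1)
        return -1;

    rc = ioctl(fd, MEMRANGE_SET, &mro);
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return rc == -1 ? -1 : 0;
#else
    (void)base;
    errno = ENOSYS;  /* mem(4) memory range ioctls are FreeBSD-specific */
    return -1;
#endif
}
```

A driver author would call this once with the physical base and length of the exposed buffer, e.g. set_write_combining(fb_base, fb_len, "mydrv"), where fb_base, fb_len and the owner tag are placeholders.
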
Back in the FreeBSD-3.0 days, I was writing a custom driver for an AGP
graphics controller, and setting the MTRR flags for the exposed buffer
was a definite improvement (200-1200% faster in most cases).