Question about adding flags to mmap system call / NVIDIA amd64 driver implementation
John Baldwin
jhb at freebsd.org
Thu Apr 30 21:41:32 UTC 2009
On Tuesday 28 April 2009 7:58:57 pm Julian Elischer wrote:
> Robert Noland wrote:
> > On Tue, 2009-04-28 at 16:48 -0500, Kevin Day wrote:
> >> On Apr 28, 2009, at 3:19 PM, Julian Bangert wrote:
> >>
> >>> Hello,
> >>>
> >>> I am currently trying to work a bit on the remaining "missing
> >>> feature" that NVIDIA requires (http://wiki.freebsd.org/NvidiaFeatureRequests
> >>> or a back post in this ML) - the improved mmap system call.
>
>
> you might check with jhb (john Baldwin) as I think (from his
> p4 work) that he may be doing something in this area in p4.
After some prompting from Robert, and given his recent needs for Xorg, I did
start hacking on this again. However, I haven't tested it yet. What I have
done so far is in //depot/user/jhb/pat/... and it does the following:
1) Adds a vm_cache_mode_t. Each arch defines the valid values for this (I've
only done the MD portions of this work for amd64 so far). Every arch must at
least define a value for VM_CACHE_DEFAULT.
2) Stores a cache mode in each vm_map_entry struct. This cache mode is then
passed down to a few pmap functions: pmap_object_init_pt(),
pmap_enter_object(), and pmap_enter_quick(). Several vm_map routines such as
vm_map_insert() and vm_map_find() now take a cache mode to use when adding a
new mapping.
3) Each VM object stores a cache mode as well (defaults to VM_CACHE_DEFAULT).
When a VM_CACHE_DEFAULT mapping is made of an object, the cache mode of the
object is used.
4) A new VM object type: OBJT_SG. This object type has its own pager that is
sort of like the device pager. However, instead of invoking d_mmap() to
determine the physaddr for a given page, it consults a pre-created
scatter/gather list (an ADT from my branch for working on unmapped buffer
I/O) to determine the backing physical address for a given virtual address.
5) A new callback for device mmap: d_mmap_single(). One of the features of
this is that it can return a vm_object_t to be used to satisfy the mmap()
request instead of using the device's device pager VM object.
6) A new mcache() system call similar to mprotect(), except that it changes
the cache mode of an address range rather than the protection. This may not
be all that useful really.
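
To make points 3 and 4 concrete, here is a minimal userland sketch, not the
code in the p4 branch. It models how a VM_CACHE_DEFAULT mapping could inherit
the object's cache mode, and how an OBJT_SG-style pager could turn an offset
into a physical address by walking a scatter/gather list. The names
VM_CACHE_UNCACHEABLE, struct sg_seg, effective_cache_mode() and sg_lookup()
are made up for illustration, as are the enum values.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical cache modes. In the real design each arch defines its own
 * values; only VM_CACHE_DEFAULT is required everywhere.
 */
typedef enum {
	VM_CACHE_DEFAULT = 0,
	VM_CACHE_UNCACHEABLE,	/* assumed name, for illustration only */
	VM_CACHE_WC
} vm_cache_mode_t;

/*
 * Point 3: a mapping made with VM_CACHE_DEFAULT uses the object's cache
 * mode; any other mode on the map entry is used as-is.
 */
static vm_cache_mode_t
effective_cache_mode(vm_cache_mode_t entry_mode, vm_cache_mode_t object_mode)
{
	return (entry_mode == VM_CACHE_DEFAULT ? object_mode : entry_mode);
}

/* One physically contiguous segment of a scatter/gather list (assumed layout). */
struct sg_seg {
	uint64_t paddr;		/* starting physical address */
	size_t	 len;		/* length in bytes */
};

/*
 * Point 4: find the backing physical address for an offset into the object
 * by walking the segments in order. Returns 0 and fills *paddr on success,
 * or -1 if the offset lies beyond the end of the list.
 */
static int
sg_lookup(const struct sg_seg *segs, size_t nsegs, size_t offset,
    uint64_t *paddr)
{
	for (size_t i = 0; i < nsegs; i++) {
		if (offset < segs[i].len) {
			*paddr = segs[i].paddr + offset;
			return (0);
		}
		offset -= segs[i].len;
	}
	return (-1);
}
```

The point of the list walk is that the pager needs no per-fault callback into
the driver: everything it needs is fixed when the list is created.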
Given all this, a driver could do the following to map a "thing" as WC in both
userland and the kernel:
1) When it learns about a "thing" it creates an SG list to describe it. If
the "thing" consists of userland pages, it has to wire the pages first. The
driver can use vm_allocate_pager() to create an OBJT_SG VM object. It sets
the object's cache mode to VM_CACHE_WC (if the arch supports that).
2) When userland wants to map the "thing" it does a device mmap() with a
proper length and a file offset that is a cookie for the "thing". The device
driver's d_mmap_single() recognizes the magic file offset and returns
the "thing"'s VM object. Since the mapping info is now part of a normal
object mapping, it will go away via munmap(), etc. The driver no longer has
to do weird gymnastics to invalidate mappings from its device pager
as "transient" mappings are no longer stored in the device pager.
3) When the driver wants to map the "thing" into the kernel, it can use
vm_map_find() to insert the "thing"'s VM object into the kernel map.
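
As a rough illustration of step 2, the cookie lookup a driver's
d_mmap_single() might perform can be modelled in plain userland C. This is a
sketch under assumptions, not the real callback: struct thing,
struct thing_object and lookup_thing() are hypothetical names, the object
struct is only a stand-in for a VM object, and the length check is my own
addition rather than something the design above requires.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the "thing"'s VM object (hypothetical, not vm_object). */
struct thing_object {
	int	cache_mode;	/* e.g. the value chosen for write-combining */
	size_t	size;		/* size of the object in bytes */
};

/* A driver-private record tying a magic file offset to an object. */
struct thing {
	uint64_t		cookie;	/* file offset userland passes to mmap() */
	struct thing_object	obj;
};

/*
 * Model of the lookup: if the requested file offset matches a known cookie
 * and the requested length fits, return that "thing"'s object. Returning
 * NULL stands for "not ours", where a real driver would fall back to the
 * ordinary device pager path.
 */
static struct thing_object *
lookup_thing(struct thing *things, size_t nthings, uint64_t offset,
    size_t len)
{
	for (size_t i = 0; i < nthings; i++) {
		if (things[i].cookie == offset && len <= things[i].obj.size)
			return (&things[i].obj);
	}
	return (NULL);
}
```

Because the returned object is mapped like any other object, tearing the
mapping down is just munmap() as described in step 2.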
And I think that is all there is to it. I still need to test this somehow to
be sure, though, and to confirm that it meets the needs of Robert and Nvidia.
--
John Baldwin