bus_dmamap_sync() for bounced client buffers from user address space

Jason Harmening jason.harmening at gmail.com
Tue Apr 28 14:47:50 UTC 2015


On 04/28/15 08:40, John Baldwin wrote:
> On Saturday, April 25, 2015 07:34:44 PM Konstantin Belousov wrote:
>> On Sat, Apr 25, 2015 at 09:02:12AM -0500, Jason Harmening wrote:
>>> It seems like in general it is too hard for drivers using busdma to deal
>>> with usermode memory in a way that's both safe and efficient:
>>> --bus_dmamap_load_uio + UIO_USERSPACE is apparently really unsafe
>>> --if they do things the other way and allocate in the kernel, then
>>> they better either be willing to do extra copying, or create and
>>> refcount their own vm_objects and use d_mmap_single (I still haven't
>>> seen a good example of that), or leak a bunch of memory (if they use
>>> d_mmap), because the old device pager is also really unsafe.
>> munmap(2) does not free the pages, it removes the mapping and dereferences
>> the backing vm object.  If the region was wired, munmap would decrement
>> the wiring count for the pages.  So if kernel code wired the region's
>> pages, they are kept wired, but no longer mapped into userspace.
>> So bcopy() still does not work.
>>
>> d_mmap_single() is used by GPU drivers, definitely by the GEM and TTM
>> code, and possibly by the proprietary nvidia driver.
> Yes, the nvidia driver uses it.  I've also used it for some proprietary
> driver extensions.

I've seen d_mmap_single() used in the GPU code, but I haven't seen it
used in conjunction with busdma (though maybe I'm not looking in the
right place).


>
>> I believe UIO_USERSPACE is almost unused; it might be there for some
>> obscure (and buggy) driver.
> I believe it was added (and only ever used) in crypto drivers, and that they
> all did bus_dma operations in the context of the thread that passed in the
> uio.  I definitely think it is fragile and should be replaced with something
> more reliable.
>
I think it's worth making the bounce-buffering logic more robust in
cases where it's not executed in the context of the owning process;
it's also a really simple set of changes.  Of course doing vslock()
beforehand is still going to be the only safe way to use that API, but
that seems reasonable as long as the requirement is documented and the
API is used sparingly (which it is).
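
For reference, an untested sketch of what I mean; the tag/map setup,
the completion path, and dma_load_cb() are hypothetical, and the point
is just the vslock/load/vsunlock sequence done from the owning
thread's context:

/*
 * Untested sketch: wire a user buffer with vslock(9) before handing it
 * to bus_dmamap_load_uio() with UIO_USERSPACE.  Tag/map creation, the
 * completion path, and dma_load_cb() are hypothetical.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/proc.h>
#include <sys/uio.h>
#include <machine/bus.h>

static void
dma_load_cb(void *arg, bus_dma_segment_t *segs, int nseg,
    bus_size_t mapsize, int error)
{
        if (error != 0)
                return;
        /* Program the hardware with segs[0..nseg-1] here. */
}

static int
start_user_dma(bus_dma_tag_t tag, bus_dmamap_t map, void *uaddr,
    size_t len, struct thread *td)
{
        struct iovec iov;
        struct uio uio;
        int error;

        /* Wire the pages so they can't be paged out mid-transfer. */
        error = vslock(uaddr, len);
        if (error != 0)
                return (error);

        iov.iov_base = uaddr;
        iov.iov_len = len;
        uio.uio_iov = &iov;
        uio.uio_iovcnt = 1;
        uio.uio_offset = 0;
        uio.uio_resid = len;
        uio.uio_segflg = UIO_USERSPACE;
        uio.uio_rw = UIO_READ;          /* device-to-host in this example */
        uio.uio_td = td;                /* must be the owning thread */

        /* The load itself still has to run in the owning process. */
        error = bus_dmamap_load_uio(tag, map, &uio, dma_load_cb, NULL,
            BUS_DMA_NOWAIT);
        if (error != 0)
                vsunlock(uaddr, len);
        /* On success: sync, bus_dmamap_unload(), then vsunlock(). */
        return (error);
}
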
In the longer term, vm_fault_quick_hold_pages() + _bus_dmamap_load_ma()
is probably the better approach for user buffers, at least for short
transfers (which I think are most of them).  At the very least, load_ma
needs to be made a public and documented KPI, though.  I'd like to try
moving some of the drm2 code over to it once I finally have a
reasonably modern machine for testing -current.
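
Roughly, I'd expect it to look like the untested sketch below.
_bus_dmamap_load_ma() is still an internal KPI, so its signature and
the start-segp-at-minus-one convention are copied from the MI/MD
busdma code as it stands today, and load_user_pages()/MAX_HELD_PAGES
are made-up names:

/*
 * Untested sketch: hold the user pages with vm_fault_quick_hold_pages(9)
 * and feed the page array to _bus_dmamap_load_ma().
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/errno.h>
#include <sys/proc.h>
#include <vm/vm.h>
#include <vm/vm_extern.h>
#include <vm/vm_page.h>
#include <machine/bus.h>

#define MAX_HELD_PAGES  16      /* arbitrary cap for a short transfer */

static int
load_user_pages(bus_dma_tag_t tag, bus_dmamap_t map, vm_offset_t uaddr,
    size_t len, struct thread *td, int *nsegsp)
{
        vm_page_t ma[MAX_HELD_PAGES];
        int count, error, nsegs;

        /* Fault in and hold the pages backing [uaddr, uaddr + len). */
        count = vm_fault_quick_hold_pages(&td->td_proc->p_vmspace->vm_map,
            uaddr, len, VM_PROT_READ | VM_PROT_WRITE, ma, MAX_HELD_PAGES);
        if (count == -1)
                return (EFAULT);

        /* segs == NULL: use the tag's internal segment array. */
        nsegs = -1;
        error = _bus_dmamap_load_ma(tag, map, ma, len, uaddr & PAGE_MASK,
            BUS_DMA_NOWAIT, NULL, &nsegs);
        if (error != 0) {
                vm_page_unhold_pages(ma, count);
                return (error);
        }
        *nsegsp = nsegs + 1;
        /* On completion: bus_dmamap_unload(), then vm_page_unhold_pages(). */
        return (0);
}
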

Either _bus_dmamap_load_ma or out-of-context UIO_USERSPACE bounce
buffering could have issues with waiting on sfbufs on some arches,
including arm.  That could be fixed by having each unmapped bounce
buffer set up a KVA mapping for the data address when it's created,
but that fix might be worse than the problem it's trying to solve.
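
To illustrate what I mean (purely hypothetical; struct bounce_slot and
these functions are made up, and the real bounce-page code carries a
lot more state), the idea is just kva_alloc() at creation time plus
pmap_qenter() at sync time instead of waiting on an sf_buf:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/errno.h>
#include <vm/vm.h>
#include <vm/vm_extern.h>
#include <vm/vm_page.h>
#include <vm/pmap.h>

struct bounce_slot {
        vm_offset_t     bs_kva;         /* reserved once, at creation */
        vm_paddr_t      bs_busaddr;     /* physical address of bounce page */
};

static int
bounce_slot_create(struct bounce_slot *bs)
{
        /* Reserve a page of KVA up front so sync never has to wait. */
        bs->bs_kva = kva_alloc(PAGE_SIZE);
        return (bs->bs_kva == 0 ? ENOMEM : 0);
}

static void *
bounce_slot_map_client(struct bounce_slot *bs, vm_paddr_t client_pa)
{
        vm_page_t m;

        /* Map the unmapped client's page into the pre-reserved KVA. */
        m = PHYS_TO_VM_PAGE(client_pa);
        pmap_qenter(bs->bs_kva, &m, 1);
        return ((void *)(bs->bs_kva + (client_pa & PAGE_MASK)));
}
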

