Call for testing and review, busdma changes
scott4long at yahoo.com
Tue Dec 25 03:13:18 UTC 2012
On Dec 24, 2012, at 6:03 PM, Ian Lepore <freebsd at damnhippie.dyndns.org> wrote:
> Yeah, I've done some low-level storage driver stuff myself (mmc/sd) and
> I can see how easy the deferred load solutions are to implement in that
> sort of driver that's already structured to operate asynchronously. I'm
> not very familiar with how network hardware drivers interface with the
> rest of the network stack. I have some idea, I'm just not sure of all
> the subtleties involved and whether there are any implications for
> something like a deferred load.
> This is one of those situations where I tend to say to myself... the
> folks who designed this stuff and imposed the "no deferred load"
> restriction on mbufs and uio but not other cases were not stupid or
> lazy, so they must have had some other reason. I'd want to know what
> that was before I went too far with trying to undo it.
Deferring is expensive from a latency standpoint. For disks, this latency was comparatively small (until recent advances in SSD), so it didn't matter, but it did matter with network devices. Also, network drivers already had the concept of dropping mbufs due to resource shortages, and the strict requirement of guaranteed transactions with storage didn't apply. Deferring and freezing queues to guarantee delivery order is a pain in the ass, so the decision was made that it was cheaper to drop an mbuf on a resource shortage rather than defer. As for uio's, they're the neglected part of the API and there's really been no formal direction or master plan put into their evolution. Anyways, that's my story and I'm sticking to it =-)
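To make that concrete, here's a rough sketch of the usual pattern in a NIC transmit path (not lifted from any particular driver; the my_* names and the segment limit are made up): load the mbuf chain with BUS_DMA_NOWAIT, retry once through m_defrag() if the chain has too many segments, and otherwise just drop the packet instead of deferring.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/malloc.h>
#include <sys/mbuf.h>
#include <machine/bus.h>

#define MY_MAX_SEGS	32		/* hypothetical per-packet segment limit */

struct my_softc {
	bus_dma_tag_t	tx_tag;
};

static int
my_encap(struct my_softc *sc, bus_dmamap_t map, struct mbuf **m_head)
{
	bus_dma_segment_t segs[MY_MAX_SEGS];
	struct mbuf *m;
	int error, nsegs;

	error = bus_dmamap_load_mbuf_sg(sc->tx_tag, map, *m_head,
	    segs, &nsegs, BUS_DMA_NOWAIT);
	if (error == EFBIG) {
		/* Chain has too many segments; collapse it and retry once. */
		m = m_defrag(*m_head, M_NOWAIT);
		if (m == NULL) {
			m_freem(*m_head);
			*m_head = NULL;
			return (ENOBUFS);
		}
		*m_head = m;
		error = bus_dmamap_load_mbuf_sg(sc->tx_tag, map, *m_head,
		    segs, &nsegs, BUS_DMA_NOWAIT);
	}
	if (error != 0) {
		/* Resource shortage: drop the packet rather than defer. */
		m_freem(*m_head);
		*m_head = NULL;
		return (error);
	}
	/* ... program segs[0 .. nsegs-1] into the tx descriptors ... */
	bus_dmamap_sync(sc->tx_tag, map, BUS_DMASYNC_PREWRITE);
	return (0);
}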
Also, eliminating the concept of deferred load from mbufs then freed us to look at ways to make the load operation cheaper. There's a lot of expensive code in _bus_dmamap_load_buffer(), but a big cost was the indirect function pointer for the callback in the load wrappers. The extra storage needed to fill in the temporary s/g list was another. Going with direct loads allowed me to remove these and eliminate most of the speed penalty.
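For contrast, here's roughly what the callback flavor looks like for a plain driver-owned buffer (again just a sketch, with made-up my_* names). The segment list comes back through the indirect function pointer, and with BUS_DMA_NOWAIT the load either completes or fails immediately instead of returning EINPROGRESS and deferring.

#include <sys/param.h>
#include <sys/systm.h>
#include <machine/bus.h>

struct my_ring {
	bus_dma_tag_t	tag;
	bus_dmamap_t	map;
	void		*vaddr;		/* buffer from bus_dmamem_alloc() or malloc() */
	bus_size_t	len;
	bus_addr_t	paddr;		/* filled in by the callback */
};

/* The indirect call: busdma hands the resolved segment list back here. */
static void
my_load_cb(void *arg, bus_dma_segment_t *segs, int nseg, int error)
{
	struct my_ring *r = arg;

	if (error != 0)
		return;
	KASSERT(nseg == 1, ("my_load_cb: unexpected nseg %d", nseg));
	r->paddr = segs[0].ds_addr;
}

static int
my_ring_load(struct my_ring *r)
{
	/* BUS_DMA_NOWAIT: fail instead of deferring (no EINPROGRESS). */
	return (bus_dmamap_load(r->tag, r->map, r->vaddr, r->len,
	    my_load_cb, r, BUS_DMA_NOWAIT));
}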
>>> Still unresolved is what to do about the remaining cases -- attempts to
>>> do dma in arbitrary buffers not obtained from bus_dmamem_alloc() which
>>> are not aligned and padded appropriately. There was some discussion a
>>> while back, but no clear resolution. I decided not to get bogged down
>>> by that fact and to fix the mbuf and allocated-buffer situations that we
>>> know how to deal with for now.
Why wouldn't these allocations be handled the same way as normal dynamic buffers, with bus_dmamap_load()?