Some busdma stats

Ian Lepore freebsd at damnhippie.dyndns.org
Wed Sep 5 15:10:35 UTC 2012


On Wed, 2012-09-05 at 08:30 -0600, Warner Losh wrote:
> > Regardless of whether we eventually fix every driver to eliminate
> > transfers that aren't aligned to cache line boundaries, or somehow
> > change the busdma code to automatically bounce unaligned requests, we
> > need efficient allocation of small buffers aligned and sized to cache
> > lines.
> 
> The issue can't be fixed in the busdma code because partial, unaligned
> transfers are fine, so long as the calling code avoids the entire
> cache line during the transfer.  Returning cache-line aligned buffers
> from the allocator will do that, of course, but it is also valid for
> the code to only use part of the buffer for the transfer.

Right.  My goal with the dma buffer pool changes isn't some sort of
magical automatic fix in the busdma layer; it's just whittling away
one small roadblock on the path to fixing this stuff.  When I first
started asking about how we should address these problems, the experts
said to keep platform-specific alignment and padding information
encapsulated within the busdma layer rather than inventing a new
mechanism to export that info to drivers.  That implies that drivers
should be allocating DMA buffers from busdma instead of allocating big
chunks of memory and sub-dividing them into smaller buffers.  

For that to work, the busdma implementation needs to be able to
efficiently allocate buffers that are properly aligned and padded and
especially that are guaranteed not to share a cache line with some other
unrelated data.  The busdma implementation can't get those guarantees
from malloc(9), and the alternatives (contigmalloc() and the kmem_alloc
family) only work in page-sized chunks.  Meanwhile, we're asking drivers
to allocate individual buffers of sometimes no more than a few bytes each.
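
To make "properly aligned and padded" concrete, here is a minimal userland
sketch, not the busdma code and not the patchset.  It assumes a 64-byte
cache line (the real value is platform-specific), and the names
SKETCH_CACHE_LINE, sketch_roundup_line() and sketch_dma_alloc() are
hypothetical.  It uses C11 aligned_alloc() where the kernel would use its
own allocator.  The point is only that both the start address and the size
get rounded to cache line boundaries, so nothing unrelated can share a line
with the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed cache line size; real hardware varies (32, 64, 128, ...). */
#define SKETCH_CACHE_LINE 64u

/* Round a request up to a whole number of cache lines. */
static size_t
sketch_roundup_line(size_t size)
{
	return ((size + SKETCH_CACHE_LINE - 1) &
	    ~((size_t)SKETCH_CACHE_LINE - 1));
}

/*
 * Hypothetical stand-in for a small DMA buffer allocation: the buffer
 * starts on a cache line boundary and occupies whole cache lines, so it
 * never shares a line with another allocation.  Release with free().
 */
static void *
sketch_dma_alloc(size_t size)
{
	/* Reject empty requests and sizes that would overflow the rounding. */
	if (size == 0 || size > SIZE_MAX - SKETCH_CACHE_LINE)
		return (NULL);
	return (aligned_alloc(SKETCH_CACHE_LINE, sketch_roundup_line(size)));
}
```

With these assumptions a 16-byte request occupies one full 64-byte line,
which is still far smaller than the page a page-granular allocator would
hand back.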

So that's all I'm addressing in the patchset I submitted:  make sure
that when we start fixing a driver to allocate the 256 individual 16-byte
IO descriptors it shares with the hardware, that doesn't result in
allocating 256 pages of memory.  Also, if the request is for
BUS_DMA_COHERENT memory, make sure that doesn't result in turning off
caching in up to 256 pages that each contain a 16-byte IO buffer and
4080 bytes of unrelated data.
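
The arithmetic behind those numbers can be sketched as below.  This is
only an illustration: it assumes 4096-byte pages (which is what the
4080-byte figure implies) and takes the cache line size as a parameter.
The function names are hypothetical and none of this is code from the
patchset.

```c
#include <assert.h>
#include <stddef.h>

/* Pages consumed when every buffer gets its own page-granular allocation. */
static size_t
sketch_pages_one_each(size_t count, size_t size, size_t page)
{
	return (count * ((size + page - 1) / page));
}

/*
 * Pages consumed when buffers are padded to whole cache lines and packed
 * into pages.  Falls back to the one-each figure if a padded buffer is
 * larger than a page.
 */
static size_t
sketch_pages_pooled(size_t count, size_t size, size_t line, size_t page)
{
	size_t padded = (size + line - 1) / line * line;
	size_t per_page = page / padded;

	if (per_page == 0)
		return (sketch_pages_one_each(count, padded, page));
	return ((count + per_page - 1) / per_page);
}

/* Bytes in a page not occupied by the single buffer placed in it. */
static size_t
sketch_unrelated_bytes(size_t size, size_t page)
{
	return (page - size);
}
```

For 256 descriptors of 16 bytes each, the page-granular approach uses 256
pages with 4080 bytes left over in each.  Packed at an assumed 64-byte
cache line, the same descriptors fit in 4 pages, or 2 pages with a 32-byte
line.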

-- Ian




More information about the freebsd-arch mailing list