Adding support for WC (write-combining) memory to bus_dma

John Baldwin jhb at
Thu Jul 12 18:11:13 UTC 2012

On Thursday, July 12, 2012 12:37:13 pm Scott Long wrote:
> ----- Original Message -----
> > From: John Baldwin <jhb at>
> > To: current at
> > Cc: scottl at; Peter Jeremy <peter at>
> > Sent: Thursday, July 12, 2012 7:40 AM
> > Subject: Adding support for WC (write-combining) memory to bus_dma
> > 
> > I have a need to allocate static DMA memory via bus_dmamem_alloc() that is 
> > also WC (for a PCI-e device so it can use "nosnoop" transactions).  
> > This is 
> > similar to what the nvidia driver needs, but in my case it is much cleaner to 
> > allocate the memory via bus dma since the existing code I am extending all 
> > uses busdma.
> > 
> > I have a patch to implement this on 8.x for amd64 that I can port to HEAD if 
> > folks don't object.  What I would really like to do is add a new paramter to 
> > 
> > bus_dmamem_alloc() to specify the memory attribute to use, but I am hesitant 
> > to break that API.  Instead, I added a new flag similar to the existing 
> > BUS_DMA_NOCACHE used to allocate UC memory.
> > 
> Please don't add new parameters.  Now that I'm carefully documenting the
> evolution of the APIs, it's becoming glaringly apparent how sloppy we are
> with API design and interface compatibility.  I'm just as guilty of it as anyone,
> but I'd really like to see less instances of call signature changes in existing
> functions; they make driver maintenance tedious and are hard to effectively
> document.  Some options I can think of:
> 1.  create bus_dmamem_alloc_attr().  I don't really like leafy API growth like
> this either, but it's not a horrible solution.

I would actually not oppose this either.  In this particular case it is a bit
useful as I think the user shouldn't have to explicitly state a default if
they don't need a non-standard attribute.

Side-band comment: if I were going to change the API of bus_dmamem_alloc(), my
biggest request would be a bus_dmamem_alloc_size() that takes the size to 
allocate as a parameter rather than pulling the size out of the tag.  I always
think of a tag as describing a DMA engine's capabilities and restrictions,
whereas the size of, say, a descriptor ring is often a software-configurable
knob (e.g. hw.igb.nrxd) and thus the allocation size isn't really a property
of the DMA engine.  I also find this to be the least intuitive behavior in the
bus DMA API.  But that is an entirely different ball of wax.

> 2.  There are existing placeholder flags, BUS_DMA_BUS[1234] that could be
> aliased and repurposed to hold 4 bits of attribute information for this function.
> The 3 and 4 variants are already in use, but I haven't looked closely to see
> their scope.

Humm, I could do that.  In practice WC is only really applicable on x86.
Also, to be honest, I doubt anyone will ever use special attributes besides
UC and WC for DMA on x86.  That is the main reason why I just added a new flag
and didn't try to add a generic scheme for specifying the memory attribute.

I could easily just move BUS_DMA_WRITE_COMBINING into x86's <machine/bus_dma.h>
and have it use one of the free placeholder flags.  That is probably the
simplest short term solution.

> 3.  Reserve the top 16 bits of the flags for attribute information.
> 4.  Move the attribute information into the tag and create new setter/getter
> accessors for attribute information.  This would probably be the cleanest,
> though it breaks the existing sloppiness of allowing different pseudo-attributes
> for different allocations under the same tag.  I've wanted to break down the
> existing bus_dma_tag_create() into finer-grained setter/getters for a while in
> any case.
> 5.  Move the attribute information into the map and force everyone to start
> creating maps for static memory allocations.  This would actually add some
> missing uniformity to the API and might actually be cleaner that option 4.
> > While doing this, I ran into an old bug, which is that if you were to call 
> > bus_dmamem_alloc() with BUS_DMA_NOCACHE but a tag that otherwise fell through 
> > to using malloc() instead of contigmalloc(), bus_dmamem_alloc() would actually
> > change the state of the entire page.  This seems wrong.  Instead, I think that 
> > any request for a non-default memory attribute should always use 
> > contigmalloc().  In fact, even better is to call kmem_alloc_contig() directly
> > rather than using contigmalloc().  However, if you change this, then 
> > bus_dmamem_free() won't always DTRT as it doesn't have enough state to 
> > know if
> > a small allocation should be free'd via free() or contigfree() (the latter 
> > would be required if it used a non-default memory attribute).  The fix I used 
> > for this was to create a new dummy dmamap that is returned by bus_dmamem_alloc 
> > if it uses contigmalloc().  bus_dmamem_free() then checks the passed in map 
> > pointer to decide which type of free to perform.  Once this is fixed, the 
> > actual WC support is rather trivial as it merely consists of passing a 
> > different argument to kmem_alloc_contig().
> Yup, this is a problem, and I like your fix; this kind of state is exactly what
> belongs in the map.

Why don't I break out the fix first as a separate patch.

John Baldwin

More information about the freebsd-current mailing list