Coherent bus_dma for ARMv7

Zbigniew Bodek zbb at semihalf.com
Wed Apr 5 10:22:20 UTC 2017


2017-04-04 18:32 GMT+02:00 Andrew Turner <andrew at fubar.geek.nz>:

>
> On 3 Apr 2017, at 15:58, Zbigniew Bodek <zbb at semihalf.com> wrote:
>
> 2017-04-03 16:37 GMT+02:00 Andrew Turner <andrew at fubar.geek.nz>:
>
>>
>> > On 3 Apr 2017, at 15:14, Zbigniew Bodek <zbb at semihalf.com> wrote:
>> >
>> > 2017-04-03 15:37 GMT+02:00 Andrew Turner <andrew at fubar.geek.nz>:
>> >
>> > > On 3 Apr 2017, at 14:16, Marcin Wojtas <mw at semihalf.com> wrote:
>> > >
>> > > Hi Adrian,
>> > >
>> > > Frankly, we are not experts in the armv6 bus_dma, which looks more
>> > > complicated than the one in arm64, so we thought it much safer not to mix
>> > > the two solutions and to leave the user a single switch to decide
>> > > which one to pick. AFAIK Andrew Turner did the opposite for arm64
>> > > (implementing a non-coherent solution on top of coherent bus_dma); however,
>> > > I'm not sure if it's possible here in an easy way - there's also
>> > > pretty significant risk of regression for all platforms. Please let me
>> > > know your opinion. Do you think some sort of update of armv6 is
>> > > doable?
>> >
>> > I don’t see any reason to think it would be difficult to add support
>> for coherent hardware to the existing armv6 busdma code. It’s mostly
>> skipping cache operations based on a flag in the DMA tag.
>> >
>> > Andrew
>> >
>> > Hello Andrew,
>> >
>> > I don't think anyone uses flags related to DMA coherency in
>> bus_dma_tag_create.
>>
>> The generic PCI and ThunderX PCIe PEM drivers do. The former based on the
>> FDT dma-coherent flag.
>>
>
> In this particular example this will work as almost all (not all) devices
> on ThunderX are PCIe devices. For most ARMv7-based SoCs this is not true.
> We would need to create a coherent DMA tag for the top level buses and
> ensure that this is propagated correctly down to the subordinate buses and
> devices.
>
>
> You will already need to ensure the property is propagated to children,
> although DMA coherency is a property of the device, not the system, e.g. it
> is possible for only some devices to be coherent depending on how the
> vendor attached them to the internal bus.
>
>
>
>>
>> >
>> > Nevertheless, for coherent platforms we want bus_dma to always map DMA
>> memory as normal WBWA regardless of the flags passed when creating a bus_dma
>> map.
>> > For example, we don't want to perform any synchronization and we want
>> to have cacheable memory regardless of whether the BUS_DMA_COHERENT flag is used.
>>
>> That’s already the case on arm64, the only synchronisation used when the
>> tag is created with BUS_DMA_COHERENT is a memory barrier.
>>
>
> For PCI.
>
>
> I have non-PCI devices with coherent DMA, I just haven’t had a chance to
> finish testing and upstreaming the patches.
>
>
>
>>
>> > Otherwise the performance improvement will apply only to those drivers
>> that use the BUS_DMA_COHERENT flag, and very few of them do. In
>> other words, what is the point of having coherent DMA if you do cache
>> maintenance anyway?
>>
>> The drivers should be getting the parent DMA tag and passing it to
>> bus_dma_tag_create. If the parent was created with BUS_DMA_COHERENT, the
>> flag is passed on to the child tag. This is how the above PCI drivers work.
>>
>>
> This basically makes sense to me if we do the same for all buses or once
> for every platform. The question is how much additional stuff is added to
> busdma_machdep-v6.c to make it work on all relevant platforms because it is
> quite different from the ARM64 implementation.
>
>
> It should just be setting a flag then using it to always allocate
> cacheable memory and stop performing cache operations on a sync. It
> shouldn’t affect existing platforms as they won’t set the appropriate flag
> when creating the tag.
>
>
> Still we can go with the ARM64 approach and add new DMA handling, parallel
> to the existing one. Improve it over time to handle non-coherent DMA and
> replace the old one with the new one when it is proven to be correct for
> all.
>
>
> The arm64 approach would be to handle BUS_DMA_COHERENT when creating a
> tag, and to use this to decide how to correctly sync the DMA memory. The
> current code has been well tested on multiple different SoCs.
>
>
Hello Andrew,

Thanks for the detailed response. Please send your patches and we will
base our armv7 work on them.

Kind regards
zbb
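
For reference, the approach discussed in this thread (a coherency flag set on
a tag, inherited by child tags, then used to always allocate cacheable memory
and to skip cache operations on sync) can be sketched as a small user-space
model. This is only an illustration under stated assumptions: every name
below (model_dma_tag, MODEL_DMA_COHERENT, model_tag_create, and so on) is
hypothetical and is not the kernel bus_dma API, and the behaviour shown for
non-coherent tags is a simplification, not the busdma_machdep-v6.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag for this model; not the kernel's BUS_DMA_COHERENT. */
#define MODEL_DMA_COHERENT 0x1u

/* Hypothetical stand-in for a DMA tag: a parent link plus flags. */
struct model_dma_tag {
	const struct model_dma_tag *parent;
	unsigned int flags;
};

enum model_mem_attr { MODEL_MEM_UNCACHED, MODEL_MEM_CACHEABLE };

/* Counters so the effect of a sync can be observed in user space. */
struct model_sync_stats {
	unsigned int cache_ops;
	unsigned int barriers;
};

/* Create a tag; the coherency flag is inherited from the parent, if any. */
void
model_tag_create(struct model_dma_tag *tag,
    const struct model_dma_tag *parent, unsigned int flags)
{
	assert(tag != NULL);
	tag->parent = parent;
	tag->flags = flags;
	if (parent != NULL)
		tag->flags |= parent->flags & MODEL_DMA_COHERENT;
}

bool
model_tag_is_coherent(const struct model_dma_tag *tag)
{
	return ((tag->flags & MODEL_DMA_COHERENT) != 0);
}

/*
 * Pick the memory attribute for an allocation. A coherent tag always gets
 * cacheable memory, whatever the caller asked for. For a non-coherent tag,
 * this model assumes a "coherent" request is served with uncached memory.
 */
enum model_mem_attr
model_alloc_attr(const struct model_dma_tag *tag, bool want_coherent)
{
	if (model_tag_is_coherent(tag))
		return (MODEL_MEM_CACHEABLE);
	return (want_coherent ? MODEL_MEM_UNCACHED : MODEL_MEM_CACHEABLE);
}

/*
 * Sync a mapping. A coherent tag needs only a memory barrier; a
 * non-coherent tag performs cache maintenance (just counted here).
 */
void
model_sync(const struct model_dma_tag *tag, struct model_sync_stats *st)
{
	if (model_tag_is_coherent(tag))
		st->barriers++;
	else
		st->cache_ops++;
}
```

In this model, a device tag created with flags 0 under a bus tag that carries
MODEL_DMA_COHERENT ends up coherent, gets cacheable memory and only counts a
barrier on sync, while a tag with no coherent ancestor behaves as before.
That matches the point made above that existing platforms should be
unaffected as long as they never set the flag.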

