svn commit: r260440 - head/sys/arm/conf
Nathan Whitehorn
nwhitehorn at freebsd.org
Wed Jan  8 15:50:12 UTC 2014
  
On 01/08/14 10:19, Ian Lepore wrote:
> On Tue, 2014-01-07 at 23:19 -0500, Nathan Whitehorn wrote:
>> On 01/07/14 22:40, Ian Lepore wrote:
>>> Author: ian
>>> Date: Wed Jan  8 03:40:18 2014
>>> New Revision: 260440
>>> URL: http://svnweb.freebsd.org/changeset/base/260440
>>>
>>> Log:
>>>   Add option USB_HOST_ALIGN to configs that contain 'device usb'.  Setting
>>>   this to the cache line size is required to avoid data corruption on armv4
>>>   and armv5, and improves performance on armv6, in both cases by avoiding
>>>   partial cacheline flushes for USB IO.
>>>   
>>>   All these configs already exist in 10-stable.  A few that don't (and
>>>   thus can't be MFC'd yet) will be committed separately.
>>>
>> There has to be -- and I do not mean this as a criticism of your patch
>> -- a better solution to this problem than USB_HOST_ALIGN. Isn't busdma
>> supposed to handle this kind of thing? Why is USB different?
>> -Nathan
>>
> USB is different because it doesn't follow the busdma rules.  It
> allocates one large buffer, then sub-divides it internally into bits
> that are used for DMA IO and adjacent bits that are accessed by the cpu
> concurrently with the DMA.  If it doesn't do that subdividing with an
> awareness of the cache line boundaries, it ends up with concurrent CPU
> and DMA access to data in the same cache line, and there's no way a
> software-assisted cache coherency scheme can reliably do busdma sync ops
> that don't corrupt either the CPU data or the DMA data.
>
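To illustrate the subdividing problem described above, here is a minimal
sketch in plain userland C. It is not the USB stack's code: carve(),
round_up(), shares_cache_line() and the 32-byte CACHE_LINE value are
hypothetical names and numbers chosen for the example (32 bytes only
because that matches the size of the corrupted chunks reported). It shows
that packing a DMA region and CPU-owned data back to back puts both in one
cache line, while rounding each start offset up to the line size keeps
them apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed cache line size for the example; real hardware varies. */
#define CACHE_LINE 32u

/* Round an offset up to the next multiple of align (a power of two). */
static size_t
round_up(size_t off, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((off + align - 1) & ~(align - 1));
}

/* True if two non-empty byte ranges touch at least one common cache line. */
static bool
shares_cache_line(size_t a_off, size_t a_len, size_t b_off, size_t b_len)
{
	size_t a_first = a_off / CACHE_LINE;
	size_t a_last = (a_off + a_len - 1) / CACHE_LINE;
	size_t b_first = b_off / CACHE_LINE;
	size_t b_last = (b_off + b_len - 1) / CACHE_LINE;

	return (a_first <= b_last && b_first <= a_last);
}

/*
 * Carve len bytes out of one large buffer.  Returns the start offset of
 * the new piece and advances *next past it.  Passing align = 1 packs the
 * pieces tightly; passing align = CACHE_LINE starts every piece on its
 * own line, so the previous piece never shares a line with it.
 */
static size_t
carve(size_t *next, size_t len, size_t align)
{
	size_t start = round_up(*next, align);

	*next = start + len;
	return (start);
}
```

With align = 1, a 20-byte DMA piece followed by an 8-byte CPU piece lands
at offsets 0 and 20, both in line 0. With align = CACHE_LINE the second
piece moves to offset 32, and a sync of the DMA piece no longer touches
the CPU data.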
> On armv6 we now automatically bounce IO that's not sized and aligned on
> cache line boundaries.  The overhead for doing so is non-trivial, doubly
> so in the case of USB, because it's the only consumer of busdma in the
> system that requires that the offset-within-page for a bounced IO be the
> same as the offset in the original page (so a pool of small bounce
> buffers for small unaligned IOs is not an option; it must allocate full
> bounce pages for every IO).
>
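The two conditions in the paragraph above can be sketched as follows. This
is an illustration under stated assumptions, not the armv6 busdma
implementation: needs_bounce(), bounce_addr(), and the CACHE_LINE and
SKETCH_PAGE_SIZE values are hypothetical. The first function expresses
"bounce anything not sized and aligned on cache line boundaries"; the
second shows why preserving the offset within the page forces a full
bounce page per IO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for the example only. */
#define CACHE_LINE		32u
#define SKETCH_PAGE_SIZE	4096u

/* An IO is bounced unless both its address and length are line-aligned. */
static bool
needs_bounce(uintptr_t addr, size_t len)
{
	return ((addr % CACHE_LINE) != 0 || (len % CACHE_LINE) != 0);
}

/*
 * Place a bounced IO at the same offset within the bounce page as it had
 * in its original page.  Because the offset can be anywhere in the page,
 * a whole page has to be reserved even for an IO of a few bytes.
 */
static uintptr_t
bounce_addr(uintptr_t bounce_page, uintptr_t orig_addr)
{
	assert(bounce_page % SKETCH_PAGE_SIZE == 0);
	return (bounce_page + (orig_addr % SKETCH_PAGE_SIZE));
}
```

For example, an 8-byte transfer at original address 0x1234 is bounced
(neither aligned nor a multiple of the line size) and, with a bounce page
at 0x8000, must be placed at 0x8234.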
> It used to be (on armv4) that when you used the busdma alloc functions
> to allocate small DMA buffers (a few bytes) the implementation allocated
> entire pages, which is pretty inefficient and can add up to a lot of
> allocation overhead.  That was cited as a reason not to change USB's
> "allocate big then subdivide" scheme.  I wrote new busdma allocators
> that use UMA pools to efficiently handle small aligned buffers of both
> normal and uncacheable (BUS_DMA_COHERENT) memory, so that's not a
> roadblock anymore.  (Arm uses the new allocator, mips never got
> converted.)
>
> So, since we keep getting reports on arm@ of data corruption that shows
> up as 32-byte chunks of bad data, and it costs real time and resources
> to try to debug each case, I figured we should just go with the fix that
> nobody likes but that actually works.
>
> -- Ian
>
>
Thanks for the explanation, the debugging, and the fix. This seems like
a straightforward bug in the USB stack. Can it be fixed, or are there
architectural reasons why it is the way it is?
-Nathan
    
    