Unsolved problem with WB caches on ARMv6

Oleksandr Tymoshenko gonzo at bluezbox.com
Sat Jan 5 06:26:35 UTC 2013


On 2012-12-09, at 10:24 PM, Oleksandr Tymoshenko <gonzo at bluezbox.com> wrote:

> Hello,
> 
> One of the long-time issues with FreeBSD/ARMv6 is that Write-Back cache
> mode does not work properly. On PandaBoard, changing the cache mode from WT to WB
> causes USB glitches (ranging from stalls to network packet corruption) and random
> memory corruption that manifests itself as userland program crashes.
> 
> gber@ tracked down one of the bugs several months ago, but it's still unusable,
> at least on my setup.
> 
> I spent some time debugging the busdma and USB code but failed to find
> anything fishy. PandaBoard's USB host controller is EHCI. QHs and QTDs are
> flushed properly. The corruption pattern in packets is weird: it's not cacheline-sized;
> it's as if a chunk of data is just missing from the bulk transfer DMA buffer. L2 cache
> is disabled.
> 
> The issue is not reproducible in QEMU. 
> 
> Fix for arm/160431 applied to busdma-v6.c didn't help. 
> 
> I'm out of ideas for now. Maybe Ian or Alan will have some suggestions on where to look?

Following up on this one. The cause of this issue was a combination of several bugs and
bad practices:

1. The USB subsystem in general, and the EHCI driver in particular, didn't like that there
    is no support for cache-coherent memory in the busdma subsystem.
2. PL310 driver bugs.
3. pmap bugs.

Fixes for (1) and (2) have been committed recently, and I believe I have finally tracked
down all the bugs in pmap:

- Missing PTE_SYNC in pmap_kremove caused severe memory corruption in userland
    applications.
- Lack of cache flushes when using special PTEs for zeroing or copying pages. If there are
    dirty lines for the destination memory and the page is later remapped as a non-cached
    region, the actual content might be overwritten by these dirty lines when they are
    evicted, either as a result of the cache eviction policy or because of a wbinv_all call.
- Missing icache sync for new mappings for userland applications.

The attached patch addresses these issues. Please review and test.
If you see something like this:

vm_thread_new: kstack allocation failed
panic: kproc_create() failed with 12
KDB: enter: panic
 
apply this patch: http://people.freebsd.org/~gonzo/patches/queue/arm-autotune-fix.diff

Here are some statistics I gathered while working on this issue.

As a test platform I used a PandaBoard ES with root mounted over NFS, and as a test I ran
buildkernel in a loop with PANDABOARD as the config file. The average time for building
the kernel with L2 cache disabled is about 3 hours. With L2 cache enabled and
write-through as the default mode: 1h10m. With L2 enabled and writeback-allocate mode
as the default: 22 minutes.

The performance gain on Raspberry Pi was marginal, though. I blame the slow network
connection, since it works over USB in PIO mode. When the RPi gets faster USB/mmc support,
the actual difference may be more substantial.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: pmap-wb-caches-fix.diff
Type: application/octet-stream
Size: 3182 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-arm/attachments/20130104/38356d58/attachment.obj>

