arm/160431: [patch] Disable interrupts during busdma cache sync
operations.
Ian Lepore
freebsd at damnhippie.dyndns.org
Sat Sep 3 17:20:12 UTC 2011
>Number: 160431
>Category: arm
>Synopsis: [patch] Disable interrupts during busdma cache sync operations.
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-arm
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sat Sep 03 17:20:12 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator: Ian Lepore <freebsd at damnhippie.dyndns.org>
>Release: FreeBSD 8.2-RC3 arm
>Organization:
none
>Environment:
FreeBSD dvb 8.2-RC3 FreeBSD 8.2-RC3 #49: Tue Feb 15 22:52:14 UTC 2011 root at revolution.hippie.lan:/usr/obj/arm/usr/src/sys/DVB arm
>Description:
Data can be corrupted when an interrupt occurs while busdma_sync_buf() is
handling a buffer that partially overlaps a cache line. One scenario, seen
in the real world, was a small IO buffer allocated in the same cache line
as an instance of a struct intr_handler. The dma sync code copied the non-DMA
data (the intr_handler struct) to a temporary buffer prior to the cache sync,
then an interrupt occurs that results in setting the it_need flag in the
struct. When control returns to the dma sync code it finishes by copying
the saved partial cache line from the temporary buffer back to the
intr_handler struct, restoring the it_need flag to zero, and resulting in
a threaded interrupt handler not running as needed.
Similar sequences can be imagined that would lead to corruption of either
the DMA'd data or non-DMA data sharing the same cache line, depending on the
timing of the interrupt, and I can't quite convince myself that the problem
only occurs in this partial-cacheline-overlap scenario. For example, what
happens if execution is in the middle of a cpu_dcache_wbinv_range() operation
and an interrupt leads to a context switch wherein the pmap code decides to
call cpu_dcache_inv_range()? So to be conservatively safe, this patch
disables interrupts for the entire duration bus_dmamap_sync_buf(), not just
when partial cache lines are being handled.
>How-To-Repeat:
It would be very difficult to set up a repeatable test of this condition. We
were lucky (!) enough to have it happen repeatably enough to diagnose.
>Fix:
Problem was discovered in an 8.2 environment, but this diff is to -current.
--- diff.tmp begins here ---
--- busdma_machdep.c.orig 2010-03-11 14:16:54.000000000 -0700
+++ busdma_machdep.c 2011-09-03 10:15:16.000000000 -0600
@@ -1091,6 +1091,14 @@ static void
bus_dmamap_sync_buf(void *buf, int len, bus_dmasync_op_t op)
{
char _tmp_cl[arm_dcache_align], _tmp_clend[arm_dcache_align];
+ uint32_t intr;
+
+ /* Interrupts MUST be disabled when handling partial cacheline flush
+ * and most likely should be disabled for all flushes. (I know for
+ * certain interrupts can cause failures on partial flushes, and suspect
+ * problems could also happen in other scenarios.)
+ */
+ intr = intr_disable();
if ((op & BUS_DMASYNC_PREWRITE) && !(op & BUS_DMASYNC_PREREAD)) {
cpu_dcache_wb_range((vm_offset_t)buf, len);
@@ -1129,6 +1137,8 @@ bus_dmamap_sync_buf(void *buf, int len,
arm_dcache_align - (((vm_offset_t)(buf) + len) &
arm_dcache_align_mask));
}
+
+ intr_restore(intr);
}
static void
--- diff.tmp ends here ---
>Release-Note:
>Audit-Trail:
>Unformatted:
More information about the freebsd-arm
mailing list