Kernel memory corruption(?) with age(4)

YongHyeon PYUN pyunyh at gmail.com
Thu Mar 31 18:32:10 UTC 2011


On Thu, Mar 31, 2011 at 11:16:52AM -0700, YongHyeon PYUN wrote:
> On Thu, Mar 31, 2011 at 08:07:17PM +0200, Yamagi Burmeister wrote:
> > On Thu, 31 Mar 2011, YongHyeon PYUN wrote:
> > 
> > >>All boxes are quadcore machines with 8GB RAM, running FreeBSD/amd64.
> > >>After limiting the memory via hw.physmem to 3GB the problems are gone.
> > >>The box has been running crash-free for more than 6 hours and has
> > >>served over 300GB of data via age(4).
> > >>
> > >
> > >Thanks for testing. Remove the hw.physmem configuration, try the
> > >attached patch, and let me know how it goes.
> > 
> > Thanks for your help, but the patch doesn't work. Another random panic -
> > this time "page fault in kernel mode" - with nothing related to age(4) or
> > the network stack in the backtrace...
> > 
> > Maybe it'll help to know about a bug fix in the Linux atl1 driver, now
> > replaced by atlx. In git commit 5f08e46b621a769e52a9545a23ab1d5fb2aec1d4
> > 64-bit DMA was disabled:
> > 
> >   64-bit DMA causes data corruption with atl1.  We don't know why, and
> >   Atheros is working on it. For now, just use 32-bit DMA. This is a big
> >   hack that is probably wrong, but it stops the bleeding.
> > 
> > There was no later follow-up on it. I think this can't be a problem
> > on FreeBSD, but maybe I'm reading the driver code wrong. The kernel.org
> > gitweb URL is:
> > 
> > http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.23.y.git;a=commitdiff;h=5f08e46b621a769e52a9545a23ab1d5fb2aec1d4
> > 
> 
> Thanks a lot! It seems the L1 controller has a data corruption issue
> when 64-bit DMA addressing is used. Try this one.

Oops, there was a bug in the previous patch.
Try this one instead.
-------------- next part --------------
Index: sys/dev/age/if_age.c
===================================================================
--- sys/dev/age/if_age.c	(revision 220116)
+++ sys/dev/age/if_age.c	(working copy)
@@ -1092,11 +1092,14 @@
 	 * Create Tx/Rx buffer parent tag.
 	 * L1 supports full 64bit DMA addressing in Tx/Rx buffers
 	 * so it needs separate parent DMA tag.
+	 * XXX
+	 * It seems enabling 64bit DMA causes data corruption. Limit
+	 * DMA address space to 32bit.
 	 */
 	error = bus_dma_tag_create(
 	    bus_get_dma_tag(sc->age_dev), /* parent */
 	    1, 0,			/* alignment, boundary */
-	    BUS_SPACE_MAXADDR,		/* lowaddr */
+	    BUS_SPACE_MAXADDR_32BIT,	/* lowaddr */
 	    BUS_SPACE_MAXADDR,		/* highaddr */
 	    NULL, NULL,			/* filter, filterarg */
 	    BUS_SPACE_MAXSIZE_32BIT,	/* maxsize */
@@ -2452,6 +2455,9 @@
 		/* Update the consumer index. */
 		sc->age_cdata.age_rr_cons = rr_cons;
 
+		bus_dmamap_sync(sc->age_cdata.age_rx_ring_tag,
+		    sc->age_cdata.age_rx_ring_map,
+		    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
 		/* Sync descriptors. */
 		bus_dmamap_sync(sc->age_cdata.age_rr_ring_tag,
 		    sc->age_cdata.age_rr_ring_map,
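
Note on the lowaddr change in the first hunk: in bus_dma(9), lowaddr and highaddr bound the window of bus address space that the device cannot access directly, so buffers that fall inside that window are bounced. The following is only a minimal userland sketch of that rule, not kernel or driver code. The SKETCH_* constants and sketch_needs_bounce() are hypothetical names made up for illustration, and the constant values are assumed to match what the corresponding BUS_SPACE_* macros expand to on amd64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the bus_dma(9) constants (assumed amd64 values). */
#define SKETCH_MAXADDR_32BIT	UINT64_C(0xFFFFFFFF)
#define SKETCH_MAXADDR		UINT64_C(0xFFFFFFFFFFFFFFFF)

/*
 * Hypothetical helper: returns true when a physical address lies in the
 * excluded window (lowaddr, highaddr], meaning the device must not be
 * handed this address directly and the buffer would have to be bounced.
 */
static bool
sketch_needs_bounce(uint64_t paddr, uint64_t lowaddr, uint64_t highaddr)
{
	/* The window is only meaningful when lowaddr does not exceed highaddr. */
	assert(lowaddr <= highaddr);
	return (paddr > lowaddr && paddr <= highaddr);
}
```

With lowaddr set to the full 64-bit maximum, as before the patch, the excluded window is empty and a buffer at, say, 0x140000000 (5GB) goes to the controller as-is. With lowaddr limited to 32 bits, that same address falls inside the window, while anything at or below 0xFFFFFFFF is still used directly. That would be consistent with the problem disappearing when hw.physmem was limited to 3GB.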

