cvs commit: src/sys/kern uipc_mbuf.c src/sys/sys mbuf.h

Thu Feb 24 20:27:13 GMT 2005

On Thu, 24 Feb 2005, Mike Silbersack wrote:

> > I just started tracking a bug report from Peter Holm in which if_rl free's
> > an already free'd mbuf, and tracked it back to the following problem: when
> > you went through and adapted various drivers to use m_defrag(), two bugs
> > were introduced:
> 
> I don't see any flaw in if_rl's use of m_defrag, please be more specific
> as to what the error is. 

Sorry, I should have been more specific.  A patch is attached.  Basically,
'm_head' can be modified, but on failure, the caller's version of m_head
isn't modified, as it's passed by value, not reference.  So the caller may
be using the wrong mbuf.  In some cases, the caller doesn't know how to
handle m_head being returned as NULL, and will try to unconditionally
re-insert the mbuf into the interface queue even if it's NULL.  I hadn't
realized that was a problem when I wrote the below patch, so haven't
checked if if_rl needs to be tweaked to handle that (most others do need
to be tweaked, so rl probably does also).

> > (1) Callers of m_defrag() did not properly handle the case where
> >    m_defrag() would return a new mbuf cluster as the head.  Specifically,
> >    on encapsulation failure, they might requeue the old head in the ifnet
> >    queue.
> 
> m_defrag ALWAYS returns a new mbuf cluster as the head! 

Then it's definitely being used incorrectly :-).

> 
> > (2) Callers of m_defrag() did not properly handle the case where
> >    m_defrag() would return NULL due to mbuf exhaustion.  Specifically, on
> >    encapsulation failure in the case where m_defrag() fails, they might
> >    attempt to enqueue a NULL mbuf pointer or a free'd mbuf pointer into
> >    the ifnet queue.
> 
> Sounds possible, we'll have to do a sweep for this.  Alternately, we
> could ask that people enable the MBUF_STRESS_TEST option and try running
> various netperf tests with kern.ipc.m_defragrandomfailures=1 and with
> net.inet.ip.mbuf_frag_size=-1 and -2.  Those tests should be sufficient
> to expose any bugs in m_defrag usage. 

I think we can find them by inspection, but making sure there fixed
requires testing :-).

Robert N M Watson

> 
> Mike "Silby" Silbersack
> 


Index: if_rl.c
===================================================================
RCS file: /home/ncvs/src/sys/pci/if_rl.c,v
retrieving revision 1.147
diff -u -r1.147 if_rl.c

--- if_rl.c	11 Feb 2005 01:05:52 -0000	1.147
+++ if_rl.c	24 Feb 2005 10:03:02 -0000
@@ -180,7 +180,7 @@
 static void rl_dma_map_txbuf	(void *, bus_dma_segment_t *, int, int);
 static void rl_eeprom_putbyte	(struct rl_softc *, int);
 static void rl_eeprom_getword	(struct rl_softc *, int, uint16_t *);
-static int rl_encap		(struct rl_softc *, struct mbuf * );
+static int rl_encap		(struct rl_softc *, struct mbuf ** );
 static int rl_list_tx_init	(struct rl_softc *);
 static int rl_ifmedia_upd	(struct ifnet *);
 static void rl_ifmedia_sts	(struct ifnet *, struct ifmediareq *);
@@ -1394,9 +1394,10 @@
  * pointers to the fragment pointers.
  */
 static int
-rl_encap(struct rl_softc *sc, struct mbuf *m_head)
+rl_encap(struct rl_softc *sc, struct mbuf **m_headp)
 {
 	struct mbuf		*m_new = NULL;
+	struct mbuf		*m_head;
 
 	RL_LOCK_ASSERT(sc);
 
@@ -1405,10 +1406,12 @@
 	 * TX buffers, plus we can only have one fragment buffer
 	 * per packet. We have to copy pretty much all the time.
 	 */
+	m_head = *m_headp;
 	m_new = m_defrag(m_head, M_DONTWAIT);
 
 	if (m_new == NULL) {
 		m_freem(m_head);
+		*m_headp = NULL;
 		return (1);
 	}
 	m_head = m_new;
@@ -1429,7 +1432,7 @@
 	}
 
 	RL_CUR_TXMBUF(sc) = m_head;
-
+	*m_headp = m_head;
 	return (0);
 }
 
@@ -1461,7 +1464,7 @@
 		if (m_head == NULL)
 			break;
 
-		if (rl_encap(sc, m_head))
+		if (rl_encap(sc, &m_head))
 			break;
 
 		/* Pass a copy of this mbuf chain to the bpf subsystem. */