panic in RELENG_5 UMA

Gleb Smirnoff glebius at FreeBSD.org
Tue Jun 21 10:51:59 GMT 2005


[ cc'ing parties involved in this part of code]

On Tue, Jun 21, 2005 at 01:07:01PM +0400, Gleb Smirnoff wrote:
T> On Tue, Jun 21, 2005 at 09:04:27AM +0200, Jeremie Le Hen wrote:
T> J> #25 0xc05a0a0b in m_freem (mb=0x0) at uma.h:304
T> J> No locals.
T> J> #26 0xc05ee0d5 in arpresolve (ifp=0xc1a5b000, rt0=0xc1d44000, m=0xc1be7200,
T> J>     dst=0xd6d3fa94, desten=0xd6d3fa2c "/??]??????w??")
T> J>     at ../../../netinet/if_ether.c:442
T> J>         la = (struct llinfo_arp *) 0xc1a75a00
T> J>         sdl = (struct sockaddr_dl *) 0xc2128910
T> J>         error = -1038972656
T> J>         rt = (struct rtentry *) 0xc1d44000
T> 
T> IMHO, this looks like a race. The route is not locked, when
T> its llinfo is edited.
T> 
T> Probably the mbuf was freed when arp reply arrived and la_hold was send.
T> Look into in_arpinput() near 736:
T> 
T>                         (*ifp->if_output)(ifp, la->la_hold, rt_key(rt), rt);
T>                         la->la_hold = 0;
T> 
T> Yeah, I have just triggered another panic running 15 instances of this script on
T> SMP box:
T> 
T> (
T> while (true); do
T> 	arp -d 81.19.64.111  >/dev/null 2>&1;
T> 	ping -c 1 -t 1 81.19.64.111 >/dev/null 2>&1;
T> done
T> ) &
T> 
T> But my duplicate free is in fxp_txeof(). This means that output thread has
T> won the race.

I suppose that the attached patch closes your race. However, there is still
race between RTM_DELETE and output path. The above script still drops kernel
to panic, but the other one. Output path works with already freed llinfo:

#28 0xc0507000 in m_freem (mb=0x0) at mbuf.h:410
#29 0xc053fde3 in arpresolve (ifp=0xc2012800, rt0=0xc22fcdec, m=0xc25a8000, dst=0xe720bb28, 
    desten=0xe720bacc "uЬbю+\001") at /usr/src/sys/netinet/if_ether.c:443
#30 0xc0538078 in ether_output (ifp=0xc2012800, m=0xc25a8000, dst=0xe720bb28, rt0=0xc22fcdec)
    at /usr/src/sys/net/if_ethersubr.c:173
#31 0xc054b5b4 in ip_output (m=0xc25a8000, opt=0xc25a80ac, ro=0xe720bb24, flags=0x20, imo=0x0, inp=0xc25eb5a0)
    at /usr/src/sys/netinet/ip_output.c:772
#32 0xc054d36b in rip_output (m=0xc25a8000, so=0x0, dst=0x0) at /usr/src/sys/netinet/raw_ip.c:320
#33 0xc054de7b in rip_send (so=0xc248c914, flags=0x0, m=0xc25a8000, nam=0xc218d410, control=0x0, td=0xc224d7d0)
    at /usr/src/sys/netinet/raw_ip.c:785
#34 0xc050a30f in sosend (so=0xc248c914, addr=0xc218d410, uio=0xe720bc3c, top=0xc25a8000, control=0x0, flags=0x0, 
    td=0xc224d7d0) at /usr/src/sys/kern/uipc_socket.c:827

(kgdb) frame 29
#29 0xc053fde3 in arpresolve (ifp=0xc2012800, rt0=0xc22fcdec, m=0xc25a8000, dst=0xe720bb28, 
    desten=0xe720bacc "uЬbю+\001") at /usr/src/sys/netinet/if_ether.c:443
443                     m_freem(la->la_hold);
(kgdb) p *la
$3 = {
  la_le = {
    le_next = 0xdeadc0de, 
    le_prev = 0xdeadc0de
  }, 
  la_rt = 0xdeadc0de, 
  la_hold = 0xdeadc0de, 
  la_preempt = 0xc0de, 
  la_asked = 0xdead
}

Fixing this one is harder. We take la from unlocked rtentry obtained via
rt_check(), or from arplookup(). The latter drops lock on rtentry, too.
Then we do some work and use this la. It may have already been freed in
arp_rtrequest(), the RTM_DELETE case.

I see two approaches here:

1) Protecting llinfo with route lock. In this case we need rt_check()
to return locked *rt (just reference won't help). We also need
arplookup() to return locked rt. And do not unlock it withing all
arpresolve() and a big part of in_arpinput() functions.

2) Add mutex to llinfo_arp. I'm afraid this will hurt performance.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
-------------- next part --------------
Index: if_ether.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/if_ether.c,v
retrieving revision 1.137
diff -u -r1.137 if_ether.c
--- if_ether.c	5 Jun 2005 03:13:12 -0000	1.137
+++ if_ether.c	21 Jun 2005 10:36:08 -0000
@@ -438,11 +438,11 @@
 	 * response yet.  Replace the held mbuf with this
 	 * latest one.
 	 */
+	RT_LOCK(rt);
 	if (la->la_hold)
 		m_freem(la->la_hold);
 	la->la_hold = m;
 	if (rt->rt_expire) {
-		RT_LOCK(rt);
 		rt->rt_flags &= ~RTF_REJECT;
 		if (la->la_asked == 0 || rt->rt_expire != time_second) {
 			rt->rt_expire = time_second;
@@ -459,8 +459,8 @@
 			}
 
 		}
-		RT_UNLOCK(rt);
 	}
+	RT_UNLOCK(rt);
 	return (EWOULDBLOCK);
 }
 
@@ -642,6 +642,8 @@
 		goto reply;
 	la = arplookup(isaddr.s_addr, itaddr.s_addr == myaddr.s_addr, 0);
 	if (la && (rt = la->la_rt) && (sdl = SDL(rt->rt_gateway))) {
+		struct mbuf *hold;
+
 		/* the following is not an error when doing bridging */
 		if (!bridged && rt->rt_ifp != ifp
 #ifdef DEV_CARP
@@ -729,11 +731,13 @@
 		if (rt->rt_expire)
 			rt->rt_expire = time_second + arpt_keep;
 		rt->rt_flags &= ~RTF_REJECT;
-		RT_UNLOCK(rt);
 		la->la_asked = 0;
 		la->la_preempt = arp_maxtries;
-		if (la->la_hold) {
-			(*ifp->if_output)(ifp, la->la_hold, rt_key(rt), rt);
+		hold = la->la_hold;
+		la->la_hold = 0;
+		RT_UNLOCK(rt);
+		if (hold) {
+			(*ifp->if_output)(ifp, hold, rt_key(rt), rt);
 			la->la_hold = 0;
 		}
 	}


More information about the freebsd-stable mailing list