LOR route vr0

M. Warner Losh imp at bsdimp.com
Sat Aug 27 17:40:42 GMT 2005


In message: <20050827181827.O24510 at fledge.watson.org>
            Robert Watson <rwatson at FreeBSD.org> writes:
: 
: On Sat, 27 Aug 2005, M. Warner Losh wrote:
: 
: > In message: <Pine.BSF.4.53.0508270912550.969 at e0-0.zab2.int.zabbadoz.net>
: >            "Bjoern A. Zeeb" <bzeeb-lists at lists.zabbadoz.net> writes:
: > : > lock order reversal
: > : >  1st 0xc17621ec rtentry (rtentry) @ /usr/src/sys/net/route.c:1269
: > : >  2nd 0xc15ec938 vr0 (network driver) @ /usr/src/sys/pci/if_vr.c:1391
: > :
: > : added with ID 140: http://sources.zabbadoz.net/freebsd/lor.html#140
: >
: > I've noticed a *HUGE* number of LORs that look like this:
: >
: > lock order reversal
: > 1st 0xc17490e4 rtentry (rtentry) @ sys/netinet/if_ether.c:445
: > 2nd 0xc15c94b0 rl1 (network driver) @ sys/pci/if_rl.c:1451
: 
: Generally speaking, network interface device driver locks follow network 
: stack locks in the lock order.  However, I've not really looked much at 
: the route table locking so can't speak to whether that is the case 
: specifically for routing locks.  If it is, the below traces reflect the 
: correct order, and you might want to add a hard-coded entry to witness in 
: order to catch the reverse order.

Can you post a quick summary of how to do that?  I tried last night
and was unsuccessful...

: Lock order reversals between the 
: network stack and device drivers tend to occur as a result of the device 
: driver calling into the network stack while holding the device driver 
: mutex.

I'm as sure as I can be that no locks are held when I call INTO the
network layer.  As far as I can tell, I only do that when I call
ifp->if_input, and I drop the locks to do that.

: Someone (tm) should work out if the right order is route locks -> 
: device driver locks, as it's likely a common class of bugs across many 
: drivers.

I just discovered the problem in my code.  I'm not sure where the
other order happens, but in my code I do the following:

	ED_LOCK(sc);
	ed_setrcr(sc);
	    ed_ds_getmcst(sc);
		IF_ADDR_LOCK(sc->ifp);
		TAILQ_FOREACH(ifma, &sc->ifp->if_multiaddrs, ifma_link) {
		    ...
		}
		IF_ADDR_UNLOCK(sc->ifp);
	ED_UNLOCK(sc);

Since the lock for ED should be a leaf lock, this causes problems.
I'm guessing that the network layer calls into the driver with this
lock held.  Without hard coding the locking into witness (see above),
I'm unsure where this happens.  A quick grep of the code doesn't
reveal anything obvious...
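
To make the ordering problem concrete, here is a minimal user-space
sketch of a witness-style check.  It is a toy model, not the kernel's
witness code.  The lock names, the hard-coded order table and the
toy_* functions are all hypothetical, and the order shown (rtentry,
then the interface address lock, then the driver lock) is only the
order assumed in the discussion above, not a confirmed one.

```c
#include <stdio.h>
#include <string.h>

/* Toy model of a witness-style lock order check; NOT the kernel's code. */
#define MAX_HELD 8

/* Hypothetical hard-coded order: names earlier in the table come first. */
static const char *toy_order[] = { "rtentry", "if_addr", "driver" };
#define NORDER (sizeof(toy_order) / sizeof(toy_order[0]))

static const char *toy_held[MAX_HELD];	/* locks currently held */
static int toy_nheld;

/* Position of a lock name in the order table, or -1 if it is not listed. */
int
toy_rank(const char *name)
{
	for (size_t i = 0; i < NORDER; i++)
		if (strcmp(toy_order[i], name) == 0)
			return ((int)i);
	return (-1);
}

/* Record an acquire; return 1 if it reverses the order, 0 otherwise. */
int
toy_lock(const char *name)
{
	int lor = 0;
	int r = toy_rank(name);

	for (int i = 0; i < toy_nheld; i++) {
		/* A held lock that ranks later than the new one is a reversal. */
		if (r >= 0 && toy_rank(toy_held[i]) > r) {
			printf("lock order reversal\n 1st %s\n 2nd %s\n",
			    toy_held[i], name);
			lor = 1;
		}
	}
	if (toy_nheld < MAX_HELD)
		toy_held[toy_nheld++] = name;
	return (lor);
}

/* Drop the most recent acquire of the named lock, if any. */
void
toy_unlock(const char *name)
{
	for (int i = toy_nheld - 1; i >= 0; i--) {
		if (strcmp(toy_held[i], name) == 0) {
			memmove(&toy_held[i], &toy_held[i + 1],
			    (size_t)(toy_nheld - i - 1) * sizeof(toy_held[0]));
			toy_nheld--;
			return;
		}
	}
}

/* Mirrors the sequence above: driver lock first, address lock inside. */
int
toy_demo(void)
{
	int lors = 0;

	lors += toy_lock("driver");	/* stands in for ED_LOCK(sc) */
	lors += toy_lock("if_addr");	/* stands in for IF_ADDR_LOCK(ifp) */
	toy_unlock("if_addr");
	toy_unlock("driver");
	return (lors);
}
```

Under that assumed order, toy_demo() reports one reversal, because
the address lock is taken while the driver lock is already held.
Taking "rtentry" and then "driver" reports none.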

When I comment out the above IF_ADDR locks, I have no more LORs, but
I suspect I now have other problems :-).

Warner

