LOR route vr0
Robert Watson
rwatson at FreeBSD.org
Sat Aug 27 17:44:54 GMT 2005
On Sat, 27 Aug 2005, M. Warner Losh wrote:
> : Generally speaking, network interface device driver locks follow network
> : stack locks in the lock order. However, I've not really looked much at
> : the route table locking so can't speak to whether that is the case
> : specifically for routing locks. If it is, the below traces reflect the
> : correct order, and you might want to add a hard-coded entry to witness in
> : order to catch the reverse order.
>
> Can you post a quickie summary on how to do that? I tried last night and
> was unsuccessful...
You need to add an entry to subr_witness.c creating a graph edge between
the softc lock and the routing lock. An example of an entry in
subr_witness.c:
	/*
	 * TCP/IP
	 */
	{ "tcp", &lock_class_mtx_sleep },
	{ "tcpinp", &lock_class_mtx_sleep },
	{ "so_snd", &lock_class_mtx_sleep },
	{ NULL, NULL },
Note that sets of ordered entries are terminated with a double-null. This
declares that locks of type "tcp" precede "tcpinp", which precede
"so_snd".
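
To make the ordering rule concrete, here is a small user-space sketch of
how such a table can be read. It is not the code in subr_witness.c: the
struct definitions are stand-ins so it compiles outside the kernel, and
declared_before() is a hypothetical helper written only for this
illustration. It assumes each ordered set ends with one { NULL, NULL }
entry and that a further { NULL, NULL } ends the whole table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's struct lock_class (assumption, not the real one). */
struct lock_class { const char *lc_name; };
static struct lock_class lock_class_mtx_sleep = { "sleep mutex" };

struct witness_order_list_entry {
	const char *w_name;
	struct lock_class *w_class;
};

static struct witness_order_list_entry order_lists[] = {
	/*
	 * TCP/IP
	 */
	{ "tcp", &lock_class_mtx_sleep },
	{ "tcpinp", &lock_class_mtx_sleep },
	{ "so_snd", &lock_class_mtx_sleep },
	{ NULL, NULL },		/* ends this ordered set */
	{ NULL, NULL }		/* a second null entry ends the table */
};

/*
 * Hypothetical helper: return 1 if the table declares that a lock of
 * type "first" is acquired before a lock of type "second", that is,
 * both appear in the same set and "first" is listed earlier.
 */
static int
declared_before(const char *first, const char *second)
{
	const struct witness_order_list_entry *e = order_lists;
	int seen_first;

	assert(first != NULL && second != NULL);
	while (e->w_name != NULL) {		/* one pass per ordered set */
		seen_first = 0;
		for (; e->w_name != NULL; e++) {
			if (seen_first && strcmp(e->w_name, second) == 0)
				return (1);
			if (strcmp(e->w_name, first) == 0)
				seen_first = 1;
		}
		e++;				/* step over the set terminator */
	}
	return (0);
}
```

With the table above, declared_before("tcp", "so_snd") is true and
declared_before("so_snd", "tcp") is false, which is the order the entry
is meant to express.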
> : Lock order reversals between the
> : network stack and device drivers tend to occur as a result of the device
> : driver calling into the network stack while holding the device driver
> : mutex.
>
> I'm as sure as I can be that no locks are held when I call INTO the
> network layer. As far as I can tell, I only do that when I call
> ifp->if_input, and I drop the locks to do that.
If I had to guess, you do a media status update, which can cause routing
socket events indicating the link went up or down.
> : Someone (tm) should work out if the right order is route locks ->
> : device driver locks, as it's likely a common class of bugs across many
> : drivers.
>
> I just discovered the problem in my code. I'm not sure where the
> other order happens, but in my code I do the following:
>
>     ED_LOCK(sc);
>     ed_setrcr(sc);
>     ed_ds_getmcst(sc);
>     IF_ADDR_LOCK(sc->ifp);
>     TAILQ_FOREACH(ifma, &sc->ifp->if_multiaddrs, ifma_link) {
>         ...
>     IF_ADDR_UNLOCK(sc->ifp);
>     ED_UNLOCK(sc);
>
> Since the lock for ED should be a leaf lock, this causes problems. I'm
> guessing that the network layer calls into the driver with this lock
> held. Without hard coding the locking into witness (see above), I'm
> unsure where this happens. A quick grep of the code doesn't reveal
> anything obvious...
I think this case should be OK, and we should document that as being the
case using a hard-coded witness entry.
> When I comment out the above IF_ADDR locks, I have no more LORs, but I
> think maybe other problems :-).
Hmmm. I was thinking that it was a separate issue. Could you try adding
a graph edge to witness forcing the ifaddrmtx's to fall before the driver
mutexes, in order to identify a path by which ifaddrmtx precedes the
driver mutex?
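
A sketch of what such a graph edge might look like follows. The lock
type names are assumptions: "if_addr_mtx" for the mutex taken by
IF_ADDR_LOCK() and "network driver" for driver mutexes initialized with
MTX_NETWORK_LOCK. Check the mtx_init() calls in your tree before relying
on either string. The struct definitions are stand-ins so the fragment
compiles outside the kernel; in subr_witness.c only the three table
entries would be added to the existing order_lists[] array.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for the kernel definitions (assumption, for illustration). */
struct lock_class { const char *lc_name; };
static struct lock_class lock_class_mtx_sleep = { "sleep mutex" };

struct witness_order_list_entry {
	const char *w_name;
	struct lock_class *w_class;
};

static struct witness_order_list_entry order_lists[] = {
	/*
	 * Interface address list before network driver mutexes.
	 * Both type names below are assumed, not confirmed.
	 */
	{ "if_addr_mtx", &lock_class_mtx_sleep },
	{ "network driver", &lock_class_mtx_sleep },
	{ NULL, NULL },		/* ends this ordered set */
	{ NULL, NULL }		/* ends the table */
};
```

If the names match, witness should complain at the first code path that
takes a driver mutex and then the address list lock, instead of waiting
until it has observed both orders at run time.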
Robert N M Watson