m_pkthdr.rcvif dangling pointer problem

Gleb Smirnoff glebius at FreeBSD.org
Thu Jul 14 15:44:59 UTC 2011


  Hi!

  This problem is definitely known and is as old as network stack
is parallel. Those, who know the problem may skip to next paragraph.
Short description is the following: every mbuf that is allocated in
an interface receive procedure has a field where a pointer to the
corresponding struct ifnet is stored. Everything is okay, until you
are running a box, where configuration of interfaces is changing
rapidly, for example a PPP access concentrator. Utilizing any facility
that can delay packet processing for example netgraph(4) or dummynet(4)
would make probability of crash much bigger. Running INVARIANTS kernel
would crash a box with dummynet and disappearing interfaces in a few
minutes.

  I see three approaches to this problem:

  1) Every m_pkthdr.rcvif should if_ref() the interface. Releasing
references can be done in the mbuf allocator: mb_dtor_mbuf(),
mb_dtor_pack(). I'm afraid this approach would degrate performance,
since adding at least a couple of atomic ops on every mbuf for its
lifetime. That is +2 atomic ops per packet.

  2) kib@ suggested to allocate ifnets from a UMA_ZONE_NOFREE zone.
I've made a compilable & working patch:

http://people.freebsd.org/~glebius/patches/ifnet.no_free

But on second though I find this a bad idea, this is just fooling
of INVARIANTS. Yes, we avoid thrashing of freed memory and rewriting
it by some other kernel allocation. But still out pointer point to
invalid ifnet. Even, if we make a check for IFF_DYING flag, we still
can not guarantee that an interface had been re-allocated for a new
instance. This would be not a panic condition, but subtle bugs in
firewalls.

  3) As we now have a straight if_index table that can grow, what about
storing the if_index in the m_pkthdr? Lookup of interface by index
is fast enough if done lockless. Doing it lockless isn't perfect, but
better than current pointer dereferncing. Optionally it could be
done with locking and with putting a reference. To avoid situation
with with getting to a re-allocated interface with the same index,
we can use a unique cookie, that is incremented in if_alloc().
Smth like:

struct ifnet * 
mbuf_rcvif(struct mbuf *m)
{
        struct ifnet *ifp;
   
        M_ASSERTPKTHDR(m);

        if (m->m_pkthdr.rcvif_idx == 0)
                return (NULL);

        ifp = ifnet_byindex_locked(m->m_pkthdr.rcvif_idx);
        if (ifp == NULL)
                return (NULL);
        if (ifp->if_unique != m->m_pkthdr.rcvif_unique)
                return (NULL);

        return (ifp);  
}

struct ifnet *
mbuf_rcvif_ref(struct mbuf *m)
{
        struct ifnet *ifp;

        M_ASSERTPKTHDR(m);

        if (m->m_pkthdr.rcvif_idx == 0)
                return (NULL);

        ifp = ifnet_byindex_ref(m->m_pkthdr.rcvif_idx);
        if (ifp == NULL)
                return (NULL);
        if (ifp->if_unique != m->m_pkthdr.rcvif_unique) {
                if_rele(ifp);
                return (NULL);
        }

        return (ifp);
}

The size of struct mbuf isn't going to change on amd64:

@@ -111,7 +111,8 @@
  * Record/packet header in first mbuf of chain; valid only if M_PKTHDR is set.
  */
 struct pkthdr {
-       struct ifnet    *rcvif;         /* rcv interface */
+       uint32_t         rcvif_idx;     /* rcv interface index */
+       uint32_t         rcvif_unique;  /* rcv interface unique id */

A proof-of-concept patch is available here:

http://people.freebsd.org/~glebius/patches/pkthdr.rcvif_idx

It doesn't cover entire kernel, LINT won't compile. It covers kern, netinet,
netinet6, several interface drivers and netgraph.

One of the problems is ongoing namespace pollution: not all code that include
mbuf.h includes if_var.h and vice versa. I've cut ifnet from m_devget(), but
my new function do declare it in parameter list :(. To deal with that I had
to declare functions in mbuf.h but implement them in if.c.

Other problem is net80211 code that abuses the rcvif pointer in mbuf packet
header for its own purposes casting it to private type. This can be fixed
utilizing mbuf_tags(9), I think. I haven't made this yet, that's why LINT
doesn't compile.

  Comments are requested!

  Any alternative ideas on solving the problem are welcome!

-- 
Totus tuus, Glebius.


More information about the freebsd-net mailing list