m_pkthdr.rcvif dangling pointer problem

Kostik Belousov kostikbel at gmail.com
Thu Jul 14 19:25:08 UTC 2011


On Thu, Jul 14, 2011 at 07:44:57PM +0400, Gleb Smirnoff wrote:
>   Hi!
> 
>   This problem is definitely known and is as old as network stack
> is parallel. Those, who know the problem may skip to next paragraph.
> Short description is the following: every mbuf that is allocated in
> an interface receive procedure has a field where a pointer to the
> corresponding struct ifnet is stored. Everything is okay, until you
> are running a box, where configuration of interfaces is changing
> rapidly, for example a PPP access concentrator. Utilizing any facility
> that can delay packet processing for example netgraph(4) or dummynet(4)
> would make probability of crash much bigger. Running INVARIANTS kernel
> would crash a box with dummynet and disappearing interfaces in a few
> minutes.
> 
>   I see three approaches to this problem:
> 
>   1) Every m_pkthdr.rcvif should if_ref() the interface. Releasing
> references can be done in the mbuf allocator: mb_dtor_mbuf(),
> mb_dtor_pack(). I'm afraid this approach would degrate performance,
> since adding at least a couple of atomic ops on every mbuf for its
> lifetime. That is +2 atomic ops per packet.
> 
>   2) kib@ suggested to allocate ifnets from a UMA_ZONE_NOFREE zone.
> I've made a compilable & working patch:
> 
> http://people.freebsd.org/~glebius/patches/ifnet.no_free
> 
> But on second though I find this a bad idea, this is just fooling
> of INVARIANTS. Yes, we avoid thrashing of freed memory and rewriting
> it by some other kernel allocation. But still out pointer point to
> invalid ifnet. Even, if we make a check for IFF_DYING flag, we still
> can not guarantee that an interface had been re-allocated for a new
> instance. This would be not a panic condition, but subtle bugs in
> firewalls.
This is not quite what I suggested (not looking at the patch).
I also noted that some measures to prevent the dereference of outdated
pointers should be taken. We discussed the generation counter for the
ifnets, and I pointed out that m_hdr has at least two MI scratch bytes.
Also, on amd64, there seems to be one spare word in pkthdr dur to
alignment.

> 
>   3) As we now have a straight if_index table that can grow, what about
> storing the if_index in the m_pkthdr? Lookup of interface by index
> is fast enough if done lockless. Doing it lockless isn't perfect, but
> better than current pointer dereferncing. Optionally it could be
> done with locking and with putting a reference. To avoid situation
> with with getting to a re-allocated interface with the same index,
> we can use a unique cookie, that is incremented in if_alloc().
> Smth like:
> 
> struct ifnet * 
> mbuf_rcvif(struct mbuf *m)
> {
>         struct ifnet *ifp;
>    
>         M_ASSERTPKTHDR(m);
> 
>         if (m->m_pkthdr.rcvif_idx == 0)
>                 return (NULL);
> 
>         ifp = ifnet_byindex_locked(m->m_pkthdr.rcvif_idx);
>         if (ifp == NULL)
>                 return (NULL);
>         if (ifp->if_unique != m->m_pkthdr.rcvif_unique)
>                 return (NULL);
> 
>         return (ifp);  
> }
> 
> struct ifnet *
> mbuf_rcvif_ref(struct mbuf *m)
> {
>         struct ifnet *ifp;
> 
>         M_ASSERTPKTHDR(m);
> 
>         if (m->m_pkthdr.rcvif_idx == 0)
>                 return (NULL);
> 
>         ifp = ifnet_byindex_ref(m->m_pkthdr.rcvif_idx);
>         if (ifp == NULL)
>                 return (NULL);
>         if (ifp->if_unique != m->m_pkthdr.rcvif_unique) {
>                 if_rele(ifp);
>                 return (NULL);
>         }
> 
>         return (ifp);
> }
> 
> The size of struct mbuf isn't going to change on amd64:
> 
> @@ -111,7 +111,8 @@
>   * Record/packet header in first mbuf of chain; valid only if M_PKTHDR is set.
>   */
>  struct pkthdr {
> -       struct ifnet    *rcvif;         /* rcv interface */
> +       uint32_t         rcvif_idx;     /* rcv interface index */
> +       uint32_t         rcvif_unique;  /* rcv interface unique id */
> 
> A proof-of-concept patch is available here:
> 
> http://people.freebsd.org/~glebius/patches/pkthdr.rcvif_idx
> 
> It doesn't cover entire kernel, LINT won't compile. It covers kern, netinet,
> netinet6, several interface drivers and netgraph.
> 
> One of the problems is ongoing namespace pollution: not all code that include
> mbuf.h includes if_var.h and vice versa. I've cut ifnet from m_devget(), but
> my new function do declare it in parameter list :(. To deal with that I had
> to declare functions in mbuf.h but implement them in if.c.
> 
> Other problem is net80211 code that abuses the rcvif pointer in mbuf packet
> header for its own purposes casting it to private type. This can be fixed
> utilizing mbuf_tags(9), I think. I haven't made this yet, that's why LINT
> doesn't compile.
> 
>   Comments are requested!
> 
>   Any alternative ideas on solving the problem are welcome!
> 
> -- 
> Totus tuus, Glebius.
> _______________________________________________
> freebsd-net at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe at freebsd.org"
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-net/attachments/20110714/c9195145/attachment.pgp


More information about the freebsd-net mailing list