m_move_pkthdr leaves m_nextpkt 'dangling'
Karim Fodil-Lemelin
kfodil-lemelin at xiplink.com
Thu Oct 12 20:43:09 UTC 2017
On 2017-07-07 10:46 AM, Andrey V. Elsukov wrote:
> On 05.07.2017 19:23, Adrian Chadd wrote:
>>> As many of you know, when dealing with IP fragments the kernel will build a
>>> list of packets (fragments) chained together through the m_nextpkt pointer.
>>> This is all good until someone tries to do a M_PREPEND on one of the packet
>>> in the chain and the M_PREPEND has to create an extra mbuf to prepend at the
>>> beginning of the chain.
>>>
>>> When doing so m_move_pkthdr is called to copy the current PKTHDR fields
>>> (tags and flags) to the mbuf that was prepended. The function also does:
>>>
>>> to->m_pkthdr = from->m_pkthdr;
>>>
>>> This, for the case I am interested in, essentially leaves the 'from' mbuf
>>> with a dangling pointer m_nextpkt pointing to the next fragment. While this
>>> is mostly harmless because only mbufs of pkthdr types are supposed to have
>>> m_nextpkt it triggers some panics when running with INVARIANTS in NetGraph
>>> (see ng_base.c :: CHECK_DATA_MBUF(m)):
>>>
>>> ...
>>> if (n->m_nextpkt != NULL) \
>>> panic("%s: m_nextpkt", __func__); \
>>> }
>>> ...
>>>
>>> So I would like to propose the following patch:
>>>
>>> @@ -442,10 +442,11 @@ m_move_pkthdr(struct mbuf *to, struct mbuf *from)
>>> if ((to->m_flags & M_EXT) == 0)
>>> to->m_data = to->m_pktdat;
>>> to->m_pkthdr = from->m_pkthdr; /* especially tags */
>>> SLIST_INIT(&from->m_pkthdr.tags); /* purge tags from src */
>>> from->m_flags &= ~M_PKTHDR;
>>> + from->m_nextpkt = NULL;
>>> }
>>>
>>> It will reset the m_nextpkt so we don't have two mbufs pointing to the same
>>> next packet. This is fairly harmless and solves a problem for us here at
>>> XipLink.
>> This seems like a no-brainer. :-) Any objections?
> I think the change is reasonable.
> But from other side m_demote_pkthdr() may also need this change.
> Maybe we can wait when Gleb will be back and review this? Also he is the
> author of the mentioned assertion in netgraph code.
>
Hi,
Any updates on this one?
There is another interesting patch I would like to share. This is
regarding the m_tag_free function pointer in the m_tag structure.
As it turns out, we use this field (m_tag_free) to track some mbuf tag
at work and, in order to properly do reference counting on it, we had to
modify m_tag_copy() the following way in order to keep the m_tag_free
function pointer to point to the same function the original tag was
pointing to (the code is a lot easier to understand than the text ...).
@@ -437,6 +437,7 @@ m_tag_copy(struct m_tag *t, int how)
} else
#endif
bcopy(t + 1, p + 1, t->m_tag_len); /* Copy the data */
+ p->m_tag_free = t->m_tag_free; /* copy the 'free' function pointer */
return p;
}
This is because m_tag_copy uses m_tag_alloc() that resets the m_tag_free
pointer to m_tag_free_default. It would be nice if this could make its
way into the mbuf tag base code.
Best regards,
Karim.
More information about the freebsd-net
mailing list