Bug in sbsndptr()

Lawrence Stewart lstewart at freebsd.org
Wed Mar 6 01:54:18 UTC 2013


On 03/06/13 01:03, Andre Oppermann wrote:
> On 05.03.2013 04:21, Lawrence Stewart wrote:
>> On 03/05/13 03:35, Andre Oppermann wrote:
>>> On 26.02.2013 14:38, Lawrence Stewart wrote:
>>>> Hi Andre,
>>>
>>> Hi Lawrence, :-)
>>>
>>>> A colleague and I spent a very frustrating day tracing an accounting
>>>> bug
>>>> in the multipath TCP patch we're working on at CAIA to a bug in
>>>> sbsndptr(). I haven't tested it with regular TCP yet, but I believe the
>>>> following patch fixes the bug (proposed commit log message is at the
>>>> top
>>>> of the patch):
>>>>
>>>> http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314.diff
>>>>
>>>>
>>>>
>>>> The patch should have no tangible effect to operation other than to
>>>> ensure the function delivers on the promise to return the closest mbuf
>>>> in the chain for the given offset.
>>>
>>> I agree that the description of sbsndptr() can be misleading as it
>>> refers
>>> to the point in time when the pointer was updated last.  Relative to now
>>> the real offset may be at the beginning of the next mbuf.
>>
>> Right, and we ran into the issue because we made an assumption based on
>> the use of the present tense in the comment:
>>
>>      "Return closest mbuf in chain for current offset."
> 
> I apologize for the incorrect and misleading description. :-)

No drama, just explaining the crux of the problem from our perspective
so it's clear why we ran into this.

>>> As you note in the proposed commit message by the time the send pointer
>>> is calculated we may have reached the end of the chain and must avoid
>>> storing a NULL pointer.  The mbuf copy routines simply skips over the
>>> additional mbuf in the chain using the returned offset.
>>>
>>> I wonder how this has caused trouble with your multipath patch.  You'd
>>> have to copy the sockbuf contents as well and unless you're using custom
>>> sockbuf and mbuf chain functions this shouldn't be a problem.  Using
>>> custom functions on a socket buffer is a delicate approach.  For a
>>> sockbuf
>>> consumer being able to handle valid offsets into an mbuf chain is a core
>>> feature and must-have part of the functionality.
>>
>> No custom sockbuf or mbuf routines are in use. We've implemented a
>> mapping shim between subflows and the socket buffer. When a subflow asks
>> the multipath layer for some data to send, the multipath layer returns a
>> mapping onto the socket buffer, which will remain valid until such time
>> as the subflow has marked the mapped data as acknowledged.
>>
>> Part of the map accounting is tracking the pointer of the first mbuf in
>> the sockbuf where the map's data begins. Our accounting assumed the mbuf
>> + the offset returned by sbsndptr had data available, which is how we
>> triggered the problem. We could have accounted for the issue in our new
>> map accounting code, but that would add additional complexity to some
>> already complex code and the better solution is to make sbsndptr DTRT.
> 
> So effectively you run a separate sbsndptr for each subflow using the
> real sbsndptr to track the head of the queue?

Yes, essentially works as you describe. The initial goal/design was to
make multi-stream support a first class citizen inside the socket
buffer, but we ran out of time to do this. The design we've come up with
is a reasonable interim to get to an alpha patch release, which should
be happening later this week if you're interested to take a look. We'll
make an announcement when it's up on the website.

> /me fears the day a mptcp import comes up.  tcp-complexity^^3. :-o

Yeah it's pretty invasive but does bring some useful features too. There
is a lot more work to do before I'd consider proposing we import it into
the stack and even then, we'll want to have a robust discussion about
when and how to do it.

Given that this is being done as part of a research project, we've also
taken the opportunity to experiment with changing some ideas and
idiosyncrasies in the existing stack code and will be doing a lot of
experimental research with the stack and iteratively refining things as
we go.

>>>> I would appreciate a review and any thoughts.
>>>
>>> I think you have found a valid (micro-)optimization.  However you're
>>> still making a dangerous assumption in that the next mbuf is indeed
>>> the one you want.  This may not be true in subtle ways when the chain
>>> contains m_len=0 mbufs in it.  I'm not aware of it actually happening
>>> but it can't be ruled out either if custom sockbuf manipulation
>>> functions
>>> are in use.
>>
>> True, though I'm struggling to think why there would be m_len=0 mbufs
>> interspersed with m_len > 0 mbufs in a socket send buffer mbuf chain.
> 
> sbcompress() doesn't allow for m_len=0 mbufs.  This holds true as long
> as the sbappend functions are used.  If not, we may get anything there.
> As long as nobody is using custom sockbuf appends we're safe.  Because
> I first assumed from your description some custom sockbuf munging the
> guarantee wouldn't haven been there anymore.

Ok cool.

>>> I'd recommend the following:
>>> have you custom sockbuf function handle forward seeking like the other
>>> m_copy() functions; and/or apply a patch along the (untested) example
>>> below.
>>
>> If you believe it is both correct and possible for m_len=0 mbufs to
>> exist in a socket buffer chain, then I agree that we should amend my
>> proposed patch to loop and skip over m_len=0 mbufs as you've suggested.
> 
> No.  So far it is neither possible or correct.
> 
>> However, I'm more inclined to suspect it is undesirable and potentially
>> buggy behaviour to end up with m_len=0 mbufs in a socket buffer chain on
>> which sbsndptr is being used, and would instead suggest a
>> "KASSERT(ret->m_len > 0, (...));" be added to the end of my proposed if
>> block.
> 
> Agreed.

How does this look?

http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314_v2.diff

Sockbuf code is tricky so I'll test this for a while and commit after it
has had a reasonable run and not shown any side effects.

Cheers,
Lawrence


More information about the freebsd-net mailing list