sl2tps, MRU, MTU, and MSS

Brian Candler B.Candler at pobox.com
Sat Jan 28 02:45:00 PST 2006


On Fri, Jan 27, 2006 at 10:39:08AM -0600, Archie Cobbs wrote:
> First of all, let's be clear about terminology: there are two different
> MRUs negotiated in opposite directions, and those negotiations are done
> independently. The problem, which is basically "the FreeBSD->WinXP MTU
> is causing a PIX-bug-triggering MSS in the WinXP->FreeBSD direction",
> arises because:
> 
>  - WinXP sets its MSS (which applies to data flowing in the FreeBSD->WinXP
>    direction) based on the MTU that it sees (which applies to the
>    WinXP->FreeBSD direction). This is a heuristic "guess" made by the
>    TCP stack, based on the assumption that the link is MTU-symmetrical.
>  - This "guess" is wrong and because of path-MTU problems can't be
>    corrected.

I have to admit that it's a long time since I used dial-up PPP, so I'm very
rusty on all of this :-)

I thought the L2TP link was symmetric in my test (i.e. both sides accepted
1400) but I don't have the logs to hand, so I'll have to check that when I'm
next in the office.

As an observation: when you ifconfig ng0, you can't set separate "transmit
MTU" and "receive MTU". So I imagine that the configured MTU only applies to
outbound datagrams, i.e. it means "don't transmit any datagram larger than
this on this interface". If ng0 were to *receive* a datagram larger than the
MTU I don't know for sure what would happen, but given that it was
successfully received, I see no reason why the kernel should discard it.

So the ng0 MTU should just match the requested MRU from the WinXP side, and
all will be well in that direction. Equally, the WinXP interface MTU should
match the MRU requested by FreeBSD.

> In any case, in the FreeBSD -> WinXP direction, you say we could send 1400
> byte packets out the ng0 interface, but this is not necessarily true. What
> is the MRU that the WinXP machine asked for? If it's 1400, then the ng0
> interface must definitely be < 1400, because of PPP overhead (e.g., IPCP).
> The 1400 negotiated by LCP applies to PPP frame payload, not IP size.

I think you're mistaken; see here in RFC 1661:

| 2.  PPP Encapsulation
| ...
|            +----------+-------------+---------+
|            | Protocol | Information | Padding |
|            | 8/16 bits|      *      |    *    |
|            +----------+-------------+---------+
| ...
|       The maximum length for the Information field, including Padding,
|       but not including the Protocol field, is termed the Maximum
|       Receive Unit (MRU), which defaults to 1500 octets.  By
|       negotiation, consenting PPP implementations may use other values
|       for the MRU.

And explicitly in RFC 1332 (IPCP):

|    The maximum length of an IP packet transmitted over a PPP link is the
|    same as the maximum length of the Information field of a PPP data
|    link layer frame.

So, a PPP MRU of 1400 lets you send an IP datagram of size 1400.

Of course, IPCP is just a control protocol for negotiating IP options. There
is no "IPCP overhead" when encapsulating an IP datagram, since the IPCP
exchange has already finished.
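
To make the point concrete, here is a minimal sketch of PPP encapsulation of
an IP datagram. The function name ppp_encapsulate is made up for
illustration, and address/control bytes, FCS and protocol-field compression
are ignored. It only shows that the MRU check applies to the Information
field, which is the IP datagram itself.

```python
PPP_PROTO_IP = 0x0021  # PPP protocol number for IPv4 datagrams


def ppp_encapsulate(ip_datagram: bytes, peer_mru: int) -> bytes:
    """Prefix an IP datagram with the 16-bit PPP Protocol field.

    Hypothetical helper: the MRU limits the Information field only,
    so the Protocol field is not counted against it.
    """
    if len(ip_datagram) > peer_mru:
        raise ValueError("IP datagram exceeds the peer's MRU")
    return PPP_PROTO_IP.to_bytes(2, "big") + ip_datagram


# A 1400-byte IP datagram fits an MRU of 1400; the frame body is 2 bytes
# longer only because of the Protocol field.
frame = ppp_encapsulate(bytes(1400), peer_mru=1400)
print(len(frame))
```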

In any case, I didn't see anything in libpdel which tried to make allowance
for a smaller IP MTU than the PPP MRU, except the WINXP_HACK.

> Seems like the proper workaround would be to configure sl2tps to negotiate
> a smaller MRU (WinXP->FreeBSD direction) than 1400. There's no config
> knob for this but one could be added. Then WinXP would "guess" better.

Bear with me while I try to understand this. We have two independent
channels, as you say.

1. For the flow of packets from FreeBSD to WinXP:

     WinXP <------------ FreeBSD <----------- rest of world

1a. WinXP asks for MRU of 1400
1b. FreeBSD accepts this
1c. FreeBSD configures the MTU in this direction as 1376, for its own
    reasons

2. For the flow of packets from WinXP to FreeBSD:

     WinXP ------------> FreeBSD -----------> rest of world
    
2a. FreeBSD asks for MRU of 1400
2b. WinXP accepts this
2c. I presume WinXP configures the MTU in this direction as 1400, although
    I don't know how to confirm this (i.e. ipconfig doesn't show the MTU)
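
The two directions can be modelled as a small sketch. The Peer class and
transmit_mtu are hypothetical names, not anything from libpdel or sl2tps; the
1400 values are the ones from the steps above. It assumes the simple rule
that a sender's MTU should equal the MRU its peer requested.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    requested_mru: int  # MRU this peer asked for during LCP negotiation


def transmit_mtu(receiver: Peer) -> int:
    """MTU a sender should use toward `receiver` (assumed rule: MTU = MRU)."""
    return receiver.requested_mru


winxp = Peer("WinXP", 1400)      # step 1a
freebsd = Peer("FreeBSD", 1400)  # step 2a

ng0_mtu = transmit_mtu(winxp)      # FreeBSD -> WinXP direction
winxp_mtu = transmit_mtu(freebsd)  # WinXP -> FreeBSD direction
print(ng0_mtu, winxp_mtu)
```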

Now: when opening a TCP connection to the outside world, Windows proposes
an MSS of (2c)-40, since there are 40 bytes of IP+TCP headers.

So yes you're right, if FreeBSD is going to choose an MTU of 1376 in step
1c, then it could propose an MRU of 1376 in step 2a, so that Windows would
choose an MSS of 1376-40.
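
The arithmetic, as a minimal sketch (function names are made up for
illustration; it assumes 20-byte IP and 20-byte TCP headers with no options):

```python
IP_TCP_HEADERS = 40  # 20-byte IPv4 header + 20-byte TCP header, no options


def mss_for_mtu(mtu: int) -> int:
    """MSS a host proposes when it derives it from its interface MTU."""
    return mtu - IP_TCP_HEADERS


def full_segment_datagram(mss: int) -> int:
    """Size of the IP datagram carrying one full-sized TCP segment."""
    return mss + IP_TCP_HEADERS


mss_today = mss_for_mtu(1400)   # what WinXP proposes with an MTU of 1400
mss_wanted = mss_for_mtu(1376)  # what it would propose with an MRU of 1376
print(mss_today, mss_wanted)

# A full segment at today's MSS does not fit the 1376-byte ng0 MTU.
print(full_segment_datagram(mss_today) > 1376)
```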

However I don't see how it could do this (easily), since it would have to
wait until it has finished negotiating the MRU from WinXP (step 1a/1b)
before it could even offer an MRU in the opposite direction (step 2a).

This does seem to be a lot of hoops to jump through, when you could simply
fix step 1c: if the WinXP machine says it can receive 1400-byte datagrams,
then configure the interface to send it datagrams of up to 1400 bytes!

> >Besides which, any PPP implementation which announces an MRU of X but then
> >refuses to receive packets of size X is so totally broken that it defeats
> >the object of PPP option negotiation in the first place.
> 
> You can remove that hack, but the hack is not the reason for the failure
> so to speak. It just happens to trigger the problem (which occurs 
> elsewhere).

Hmm, maybe. If you follow the letter of the RFCs, then if an implementation
says it will accept an MRU of X, you are under no obligation to send it
datagrams of size X. But it's rather pointless to get your IP stack to
fragment (or reject) datagrams no larger than X. In this case, WinXP said
it could accept datagrams of up to 1400, but FreeBSD has decided that
any incoming datagram between 1377 and 1400 bytes needs to be fragmented or
returned to sender, and this is pointless. As you say, it does trigger the
path MTU problem elsewhere in the network, but even if path MTU were working
correctly, it would result in a sub-optimal choice of MSS.
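
A minimal sketch of which datagram sizes are affected, using the 1376 and
1400 figures from this thread (the classify function and its labels are
purely illustrative):

```python
def classify(size: int, interface_mtu: int = 1376, peer_mru: int = 1400) -> str:
    """Say what happens to a datagram of `size` bytes routed out of ng0."""
    if size <= interface_mtu:
        return "sent as-is"
    if size <= peer_mru:
        # The peer said it could receive this, yet the interface MTU
        # forces fragmentation (or an ICMP error if DF is set).
        return "needlessly fragmented or rejected"
    return "larger than the peer's MRU"


affected = [n for n in range(1, 1501)
            if classify(n) == "needlessly fragmented or rejected"]
print(affected[0], affected[-1], len(affected))
```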

(Aside: RFC 1661 section 6.1 says that if an implementation asks for an MRU
of less than 1500, it MUST still be able to receive packets of size 1500. So
we could legitimately ifconfig ng0 to 1500 for all MRU smaller than 1500!)

> The hack itself should probably be turned into a config knob too.

Well I think we agree there :-)

Regards,

Brian.

