Re: IPv6 inflight fragmentation

From: Andrey V. Elsukov <bu7cher_at_yandex.ru>
Date: Tue, 02 Nov 2021 13:06:02 UTC
01.11.2021 23:56, Peter wrote:
> ! divert rule does implicit IP fragments reassembling before passing a
> ! packet to application. I don't think dummynet is affected by this.
> 
> No, we're not going to an application, we are routing to the
> Internet. And the uplink iface (tun0) has mtu=1492. And we have a rule
> in ipfw, like:
> 
>> queue 21 proto all <whatever> xmit tun0 out
> 
> And we have sysctl net.inet.ip.fw.one_pass=0
> 
> So, at the time when we go thru the queue, we do not yet know the
> actual interface to use for xmit (because there might still be a
> "forward" rule following), so we do not yet know the mtu.
> 
> Only when we finally give the packet out for sending, *after* passing
> the queue, then we will recognize our actual mtu. And then the
> difference happens:
> 
>  * if we did *not* go through the queue, the packet is (probably)
>    dropped and an ICMPv6 type 2 ("too big") is sent back to the
>    originator. This is how I understand that it should work, and
>    that works.

Hi,

Without divert/dummynet rules, packets are handled through the usual
forwarding path, i.e. ip6_tryforward() handles the case where the packet
exceeds the MTU and sends an ICMP6_PACKET_TOO_BIG message.
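
For illustration, here is a rough sketch of what such a "too big" error
carries on the wire. It is not the kernel code, just a model in Python; the
function name build_packet_too_big is made up for this example, and the
checksum is left at zero because computing it needs the IPv6 pseudo-header.

```python
import struct

ICMP6_PACKET_TOO_BIG = 2   # ICMPv6 type for "Packet Too Big" (RFC 4443)
IPV6_MIN_MTU = 1280        # minimum IPv6 link MTU
IPV6_HDR_LEN = 40          # fixed IPv6 header
ICMP6_HDR_LEN = 8          # type, code, checksum, 32-bit MTU field


def build_packet_too_big(next_hop_mtu: int, invoking_packet: bytes) -> bytes:
    """Return the ICMPv6 part of a Packet Too Big error (checksum left at 0)."""
    # The whole error packet must fit into the minimum IPv6 MTU, so the
    # quoted copy of the offending packet is truncated accordingly.
    max_quote = IPV6_MIN_MTU - IPV6_HDR_LEN - ICMP6_HDR_LEN
    header = struct.pack("!BBHI", ICMP6_PACKET_TOO_BIG, 0, 0, next_hop_mtu)
    return header + invoking_packet[:max_quote]
```

The 32-bit field after the checksum is the MTU of the next-hop link (1492
for the tun0 interface in this thread), which is what lets the sender shrink
its packets.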

>  * if we *did* go through the queue, the packet is split into
>    fragments although it is IPv6. And that does not work; such packet
>    does not get answered by Youtube, and playback hangs. From a quick
>    glance the fragments do look technically correct - and I have no
>    idea why YT would receive a fullsized packet from the player,
>    anyway (and I won't analyze their stuff).

And this seems to be where the problem is. When you use a dummynet rule
with the "out xmit" opcode, it is handled on the PFIL_OUT|PFIL_FWD pass. As
a result, dummynet consumes the packet and sends it to ip6_output() with
the IPV6_FORWARDING flag. Currently this flag is only meaningful for
multicast routing. So the problem is that a router that applies a dummynet
rule to a forwarded packet can fragment it, which it must not do.

Alexander and Bjoern, can you take a look at this?
I made a quick patch that checks for the PFIL_FWD and IPV6_FORWARDING
flags. With it, dummynet knows when we are forwarding and sets
IPV6_FORWARDING only in that case. ip6_output() then sets its dontfrag
variable when IPV6_FORWARDING is set. Finally, when we get an EMSGSIZE
error with the IPV6_FORWARDING flag set, we send an ICMP6_PACKET_TOO_BIG
error message instead of silently dropping the packet.

https://people.freebsd.org/~ae/ip6_dont_frag.diff

The patch doesn't touch the divert code. I think a diverted packet can be
treated as locally generated, so it is OK to fragment it.
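
The intended behaviour can be summarised with a toy model. This is only a
sketch of the decision described above, not the code from the diff: the
function output_verdict and the Verdict names are invented for this
example, and the numeric flag value is an assumption for the model.

```python
from enum import Enum

# Flag value assumed for this model only; check the FreeBSD sources for
# the real definition of IPV6_FORWARDING.
IPV6_FORWARDING = 0x02


class Verdict(Enum):
    SEND = "send as is"
    FRAGMENT = "fragment and send"
    TOO_BIG = "drop and send ICMP6_PACKET_TOO_BIG"


def output_verdict(pkt_len: int, mtu: int, flags: int) -> Verdict:
    """Toy model of the output decision described for the patch."""
    if pkt_len <= mtu:
        return Verdict.SEND
    # Forwarded packets must not be fragmented by the router (dontfrag).
    dontfrag = bool(flags & IPV6_FORWARDING)
    if dontfrag:
        # EMSGSIZE on a forwarded packet: report it to the sender.
        return Verdict.TOO_BIG
    # Locally generated (or diverted) packets may still be fragmented.
    return Verdict.FRAGMENT
```

With the 1492-byte MTU from this thread, a forwarded 1500-byte packet gives
Verdict.TOO_BIG, while the same packet without the flag gives
Verdict.FRAGMENT.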

> The behaviour is the same if there is either a "queue" action or
> a "divert" action or both.
> With "divert" we know that the mbuf flags are lost - with dummynet
> I did not yet look into the code. I had a hard time finding the cause
> in bulky video data, and then I simply reduced the mtu one hop earlier
> within my intranet, to workaround the issue for now.

As a workaround, it is usually enough to use the tcp-setmss opcode.
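
As a hint for choosing the value, the arithmetic can be sketched as below.
It assumes a plain 40-byte IPv6 header with no extension headers and a
20-byte TCP header; max_mss_for_mtu is a made-up helper name.

```python
IPV6_HDR_LEN = 40   # fixed IPv6 header, no extension headers
TCP_HDR_LEN = 20    # TCP header without options


def max_mss_for_mtu(mtu: int) -> int:
    """Largest TCP MSS whose segments fit into one IPv6 packet of size mtu."""
    return mtu - IPV6_HDR_LEN - TCP_HDR_LEN


print(max_mss_for_mtu(1492))  # 1432
```

So for the tun0 uplink with mtu=1492, an MSS of 1432 or lower keeps TCP
segments small enough that no fragmentation is needed.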

-- 
WBR, Andrey V. Elsukov