TCP6 regression for MTU path on stable/13

From: Harry Schmalzbauer <freebsd_at_omnilan.de>
Date: Sun, 12 Sep 2021 11:12:24 UTC
Hello,

on one of my production stable/13 setups, MTU definitions from the 
routing table aren't respected (anymore?).
I have always had jumbo frames enabled and set a fixed MTU for routed 
destinations.
There were no issues with stable/13 until April, I guess (but I never 
checked for 'packet too big' ICMP6 messages before).
After upgrading from stable/13 as of April to stable/13 as of August,
TCP6 connections suffer from massive MTU-induced performance drops.
There is no issue for ICMP4.
For ICMP6, there seems to be a miscalculation somewhere.

Let's start with TCP6:
42:c9:f9:fc:82:02 > 96:07:e9:f9:fc:85, ethertype IPv6 (0x86dd), length 
1294: 2003:a:f43:84a2::1 > 2003:a:f43:84a2::10: ICMP6, packet too big, 
mtu 1492, length 1240

in response to
96:07:e9:f9:fc:85 > 42:c9:f9:fc:82:02, ethertype IPv6 (0x86dd), length 
1770: 2003:a:f43:84a2::10.22 > 2003:a:47f:6ba1::3:1.55102: Flags [.], 
seq 2417:4101, ack 108, win 1030, options [nop,nop,TS val 4168798600 ecr 
1552727989], length 1684


42:c9:f9:fc:82:02
is the next hop at 2003:a:f43:84a2::1, which is routing the packets.
96:07:e9:f9:fc:85
is an SSH server at 2003:a:f43:84a2::10, which responds to a shell command.
2003:a:47f:6ba1::3:1(.55102) is the SSH client.


96:07:e9:f9:fc:85 / 2003:a:f43:84a2::10 transmits a packet with length 
1770 to 2003:a:47f:6ba1::3:1.

That must not happen, since the route has a fixed MTU of 1492:
route -6 get 2003:a:47f:6ba1::3:1
    route to: 2003:a:47f:6ba1::3:1
destination: 2003:a:47f:6ba1::
        mask: ffff:ffff:ffff:ffff::
     gateway: cupid
         fib: 0
   interface: ht0hsm
       flags: <UP,GATEWAY,DONE,STATIC,FIXEDMTU>
  recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight expire
        0         0         0         0      1492         1 0
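
To make the mismatch explicit, here is a minimal Python sketch of the size arithmetic for the TCP6 segment shown above. The header sizes are the standard fixed ones (14 bytes Ethernet, 40 bytes IPv6, 20 bytes TCP, 12 bytes for the nop,nop,TS options); the function names are hypothetical and only serve the illustration.

```python
# Fixed protocol header sizes in bytes (standard values, not taken from the capture).
ETHERNET_HEADER = 14
IPV6_HEADER = 40
TCP_HEADER = 20
TCP_TIMESTAMP_OPTIONS = 12  # nop + nop + 10-byte timestamp option


def ipv6_packet_size(tcp_payload, tcp_options=TCP_TIMESTAMP_OPTIONS):
    """Size of the IPv6 packet that carries a TCP segment with this payload."""
    return IPV6_HEADER + TCP_HEADER + tcp_options + tcp_payload


def max_tcp_payload(mtu, tcp_options=TCP_TIMESTAMP_OPTIONS):
    """Largest TCP payload whose IPv6 packet still fits into the given MTU."""
    return mtu - IPV6_HEADER - TCP_HEADER - tcp_options


ROUTE_MTU = 1492          # "mtu" column of the route -6 get output
OBSERVED_PAYLOAD = 1684   # TCP "length 1684" from the tcpdump line

packet = ipv6_packet_size(OBSERVED_PAYLOAD)
frame = packet + ETHERNET_HEADER

print("IPv6 packet size:", packet)
print("Ethernet frame size:", frame)
print("bytes over the route MTU:", packet - ROUTE_MTU)
print("largest payload that fits:", max_tcp_payload(ROUTE_MTU))
```

Under these assumptions the segment is a 1756-byte IPv6 packet (1770 bytes on the wire, matching the capture), while a route MTU of 1492 would allow at most 1420 bytes of TCP payload per segment with timestamps enabled.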


With TSO enabled on if_igb(4) (i350), this results in unusably low 
connection bandwidth.
For reasons I haven't tracked down yet, the 'packet too big' messages 
seem to get lost, and some timeout leads to retransmissions...
The result is a *throughput of around 100kBit/s!*

With TSO disabled, the problem still shows up, but the impact is far 
smaller.


Here's an example of a (probably not directly related) ICMP6 oddity that I 
accidentally noticed while hunting the recently introduced issue:

ping -s 1453 -D 2003:a:47f:6ba1::3:1
PING6(1501=40+8+1453 bytes) 2003:a:f43:84a2::2:130 --> 2003:a:47f:6ba1::3:1
ping: sendmsg: Message too long

ping -s 1452 -D 2003:a:47f:6ba1::3:1
PING6(1500=40+8+1452 bytes) 2003:a:f43:84a2::2:130 --> 2003:a:47f:6ba1::3:1
no reply, because:
96:07:e9:f9:fc:85 > 42:c9:f9:fc:82:02, ethertype IPv6 (0x86dd), length 
1514: 2003:a:f43:84a2::2:130 > 2003:a:47f:6ba1::3:1: ICMP6, echo request,
seq 0, length 1460
42:c9:f9:fc:82:02 > 96:07:e9:f9:fc:85, ethertype IPv6 (0x86dd), length 
1294: 2003:a:f43:84a2::1 > 2003:a:f43:84a2::2:130: ICMP6, packet too big,
mtu 1492, length 1240
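
The numbers in the two ping runs can be checked with a small Python sketch. It assumes only the fixed header sizes (40 bytes IPv6, 8 bytes ICMP6 echo header, 14 bytes Ethernet) and the IPv6 minimum MTU of 1280; the helper names are hypothetical.

```python
# Fixed sizes in bytes (standard values).
ETHERNET_HEADER = 14
IPV6_HEADER = 40
ICMP6_HEADER = 8
IPV6_MIN_MTU = 1280  # ICMP6 error messages are limited to this packet size


def ping6_packet_size(payload):
    """IPv6 packet size produced by 'ping -s payload' over IPv6."""
    return IPV6_HEADER + ICMP6_HEADER + payload


def max_ping6_payload(mtu):
    """Largest 'ping -s' value whose packet fits into the given MTU."""
    return mtu - IPV6_HEADER - ICMP6_HEADER


ROUTE_MTU = 1492       # fixed MTU on the route
APPARENT_LIMIT = 1500  # size at which sendmsg starts to fail

for payload in (1453, 1452):
    size = ping6_packet_size(payload)
    print(payload, "->", size,
          "accepted by sendmsg" if size <= APPARENT_LIMIT else "rejected by sendmsg",
          "| fits route MTU" if size <= ROUTE_MTU else "| exceeds route MTU")

print("largest -s for the route MTU:", max_ping6_payload(ROUTE_MTU))
print("frame size of a full 'packet too big' reply:",
      IPV6_MIN_MTU + ETHERNET_HEADER)
```

If the route MTU of 1492 were applied to ICMP6, the sendmsg limit would be expected at -s 1444 rather than -s 1452. The 1294-byte 'packet too big' frames are consistent with a 1280-byte ICMP6 error plus the Ethernet header.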

Altering the MTU of the v6 route doesn't influence ICMP6 sendmsg behaviour 
at all; the limit is always 1500 bytes, no matter what MTU I define!
This is also true for a stable/12 from long ago!


I will try to track it down further, but in case anybody has an idea what 
change during the last few months in stable/13 could have caused this 
real-world problem with the resulting TCP6 throughput, I'm happy to 
start testing at that point.


Thanks,

-harry