Re: Problem with net.inet.tcp.path_mtu_discovery=1
- In reply to: Christos Chatzaras : "Re: Problem with net.inet.tcp.path_mtu_discovery=1"
Date: Thu, 05 Jun 2025 05:04:44 UTC
> On 4. Jun 2025, at 19:29, Christos Chatzaras <chris@cretaforce.gr> wrote:
>
>> On 4 Jun 2025, at 19:36, Dave Cottlehuber <dch@skunkwerks.at> wrote:
>>
>> On Wed, 4 Jun 2025, at 16:36, Christos Chatzaras wrote:
>>> Hello,
>>>
>>> I manage some servers hosting websites.
>>
>> What does tcpdump/wireshark show for traffic, particularly icmp? Wireshark is very helpful in explaining some issues.
>>
>> What is the actual MTU on the working net vs the failing one?
>>
>> Is there a local MTU where the failing websites start working again?
>>
>> see ping(8) and use -v -D -s …. together to find a working MTU and cross check with tcpdump to find where things seem to break.
>>
>> On a recent cloud environment I needed to add ‘ set reassemble yes no-df’ to my pf.conf to address MTU issues between VNET jails and the internet.
>>
>> Happy hunting
>> Dave
>>
>
> First, I reverted the server settings to their defaults:
> sysctl net.inet.tcp.path_mtu_discovery=1
> sysctl net.inet.tcp.pmtud_blackhole_detection=0
>
> Next, I set the MTU on my local computer to 1460 and everything worked as expected:
> tcpdump: listening on en0, link-type EN10MB (Ethernet), snapshot length 524288 bytes
> 20:15:05.651375 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64)
>     192.168.2.18.65322 > 94.130.217.87.443: Flags [S], cksum 0x293e (correct), seq 3503095669, win 65535, options [mss 1420,nop,wscale 6,nop,nop,TS val 639376397 ecr 0,sackOK,eol], length 0
> 20:15:05.705913 IP (tos 0x0, ttl 54, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>     94.130.217.87.443 > 192.168.2.18.65322: Flags [S.], cksum 0x9c22 (correct), seq 3647364942, ack 3503095670, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 1782053626 ecr 639376397], length 0
>
> However, when I set my local computer’s MTU back to 1500 (the default), the issue reappeared:
> tcpdump: listening on en0, link-type EN10MB (Ethernet), snapshot length 524288 bytes
> 20:17:45.662993 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 64)
>     192.168.2.18.65333 > 94.130.217.87.443: Flags [S], cksum 0x4a07 (correct), seq 3674289142, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 681359835 ecr 0,sackOK,eol], length 0
> 20:17:45.726988 IP (tos 0x0, ttl 54, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>     94.130.217.87.443 > 192.168.2.18.65333: Flags [S.], cksum 0x9b1d (correct), seq 1443843488, ack 3674289143, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2890559459 ecr 681359835], length 0
>
> So, with local computer MTU 1460, everything works, but with MTU 1500, the problem persists.

The difference is that you announce a smaller MSS (1420 instead of 1460) in the SYN segment you send. This means that the peer can only send you smaller TCP segments. So there seems to be a problem when the peer sends TCP segments that are too large. That means that the peer must do PMTUD or TCP blackhole detection, not the local node.

Best regards
Michael
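
The MSS arithmetic behind the two captures can be sketched as follows. This is only an illustration, assuming IPv4 with a 20-byte IP header and a 20-byte TCP header; the function names are hypothetical, and the peer MTU of 1500 is inferred from the "mss 1460" in its SYN-ACK rather than stated anywhere in the thread.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, fixed TCP header; TCP options are not counted in the MSS


def announced_mss(mtu: int) -> int:
    """MSS a host announces in its SYN for a given interface MTU (IPv4)."""
    return mtu - IPV4_HEADER - TCP_HEADER


def largest_packet_from_peer(local_mtu: int, peer_mtu: int) -> int:
    """Largest IP packet the peer may send us on this connection.

    The peer is limited both by the MSS we announced and by what its own
    interface MTU allows.
    """
    effective_mss = min(announced_mss(local_mtu), announced_mss(peer_mtu))
    return effective_mss + IPV4_HEADER + TCP_HEADER


if __name__ == "__main__":
    peer_mtu = 1500  # assumption: inferred from the peer's "mss 1460"
    for local_mtu in (1460, 1500):
        print(
            f"local MTU {local_mtu}: announced MSS {announced_mss(local_mtu)}, "
            f"peer may send IP packets up to {largest_packet_from_peer(local_mtu, peer_mtu)} bytes"
        )
```

With a local MTU of 1460 the peer is limited to 1460-byte packets, while with 1500 it may send full 1500-byte packets with DF set. If some hop on the path from the peer has a smaller MTU and the resulting ICMP "fragmentation needed" messages do not reach the peer, those larger packets would be lost, which would be consistent with the behaviour described above.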