Re: ssh connections break with "Fssh_packet_write_wait" on 13 [SOLVED]

From: Michael Gmelin <freebsd_at_grem.de>
Date: Tue, 08 Jun 2021 20:47:25 UTC

On Thu, 3 Jun 2021 15:09:06 +0200
Michael Gmelin <freebsd@grem.de> wrote:

> On Tue, 1 Jun 2021 13:47:47 +0200
> Michael Gmelin <freebsd@grem.de> wrote:
> 
> > Hi,
> > 
> > Since upgrading servers from 12.2 to 13.0, I get
> > 
> >   Fssh_packet_write_wait: Connection to 1.2.3.4 port 22: Broken pipe
> > 
> > consistently, usually after about 11 idle minutes, that's with and
> > without pf enabled. Client (11.4 in a VM) wasn't altered.
> > 
> > Verbose logging (client and server side) doesn't show anything
> > special when the connection breaks. In the past, QoS problems
> > caused these disconnects, but I didn't see anything apparent
> > changing between 12.2 and 13 in this respect.
> > 
> > I did a test on a newly commissioned server to rule out other
> > factors (so, same client connections, same routes, same
> > everything). On 12.2 before the update: Connection stays open for
> > hours. After the update (same server): connections break
> > consistently after < 15 minutes (this is with unaltered
> > configurations, no *AliveInterval configured on either side of the
> > connection). 
> 
> I did a little bit more testing and realized that the problem goes
> away when I disable "Proportional Rate Reduction per RFC 6937" on the
> server side:
> 
>   sysctl net.inet.tcp.do_prr=0
> 
> Keeping it on and enabling net.inet.tcp.do_prr_conservative doesn't
> fix the problem.
> 
> This seems to be specific to Parallels. After some more digging, I
> realized that Parallels Desktop's NAT daemon (prl_naptd) handles
> keep-alive between the VM and the external server on its own. There is
> no direct communication between the client and the server. This means:
> 
> - The NAT daemon starts sending keep-alive packets right away (not
>   after the VM's net.inet.tcp.keepidle), every 75 seconds.
> - Keep-alive packets originating in the VM never reach the server.
> - Keep-alive originating on the server never reaches the VM.
> - Client and server basically do keep-alive with the NAT daemon, not
>   with each other.
> 
> It also seems like Parallels is filtering the tos field (so it's
> always 0x00), but that's unrelated.
> 
> I configured a bhyve VM running FreeBSD 11.4 on a separate laptop on
> the same network for comparison and it has no such issues.
> 
> Looking at tcpdump output on the server, this is what a keep-alive
> packet sent by Parallels looks like:
> 
>   10:14:42.449681 IP (tos 0x0, ttl 64, id 15689, offset 0, flags [none],
>     proto TCP (6), length 40)
>     192.168.1.1.58222 > 192.168.1.2.22: Flags [.], cksum x (correct),
>     seq 2534, ack 3851, win 4096, length 0
> 
> While those originating from the bhyve VM (after lowering
> net.inet.tcp.keepidle) look like this:
> 
>   12:18:43.105460 IP (tos 0x0, ttl 62, id 0, offset 0, flags [DF],
>     proto TCP (6), length 52)
>     192.168.1.3.57555 > 192.168.1.2.22: Flags [.], cksum x
>     (correct), seq 1780337696, ack 45831723, win 1026, options
>     [nop,nop,TS val 3003646737 ecr 3331923346], length 0
> 
> As written above, once net.inet.tcp.do_prr is disabled, keepalive
> seems to be working just fine. Otherwise, Parallels' NAT daemon kills
> the connection, as its keep-alive requests are not answered (well,
> that's what I think is happening):
> 
>   10:19:43.614803 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
>     proto TCP (6), length 40)
>     192.168.1.1.58222 > 192.168.1.2.22: Flags [R.], cksum x (correct),
>     seq 2535, ack 3851, win 4096, length 0
> 
> The easiest way to work around the problem client side is to configure
> ServerAliveInterval in ~/.ssh/config in the client VM.
> 
> I'm curious though if this is basically a Parallels problem that has
> only been exposed by PRR being more correct (which is what I suspect),
> or if this is actually a FreeBSD problem.
> 

So, PRR probably was a red herring. The real reason this is happening
is that FreeBSD (since version 13[0]) by default discards packets
without timestamps on connections that had formally negotiated to use
them. This new behavior seems to be in line with RFC 7323, section
3.2[1]:

    "Once TSopt has been successfully negotiated, that is both <SYN> and
    <SYN,ACK> contain TSopt, the TSopt MUST be sent in every non-<RST>
    segment for the duration of the connection, and SHOULD be sent in an
    <RST> segment (see Section 5.2 for details)."

As it turns out, macOS does not comply with this: it sends keep-alive
packets without a timestamp on connections that negotiated timestamps.

Under normal circumstances - ssh from macOS to a server running FreeBSD
13 - this won't be noticed, since macOS uses the same default settings
as FreeBSD (2 hours idle time, 75 second intervals), so the server-side
keep-alive will save the connection before it has a chance to break
due to eight consecutive unanswered keep-alives on the client side.

This is different for ssh connections originating from a VM inside
Parallels, as connections created by prl_naptd will start sending tcp
keep-alives shortly after the connection becomes idle. As a result,
idle connections break after about 11 minutes of idle time (60s
+ 8*75s = 660s == 11m), unless countermeasures are taken.
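
The arithmetic above can be sketched as a small shell helper. The
function name is made up for illustration; the inputs are the idle
time before the first probe, the probe interval, and the number of
unanswered probes, as used in the formula above.

```shell
# Seconds of idle time until a connection is dropped when every
# keep-alive probe goes unanswered.
# Arguments: idle time before first probe (s), probe interval (s),
#            number of unanswered probes tolerated.
keepalive_deadline() {
    keepidle="$1"
    keepintvl="$2"
    keepcnt="$3"
    echo $(( keepidle + keepcnt * keepintvl ))
}

keepalive_deadline 60 75 8      # prl_naptd case from the text: prints 660
keepalive_deadline 7200 75 8    # 2 hours idle, 75 s interval: prints 7800
```

With the default two hours of idle time the deadline is far enough out
that the server's own keep-alive normally gets there first.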

An easy way to demonstrate the problem is to lower the keep-alive
defaults on *macOS* using sysctl (values are in milliseconds there) and
then ssh to a FreeBSD 13 server:

    $ sudo sysctl net.inet.tcp.keepidle=5000
    $ sudo sysctl net.inet.tcp.keepintvl=5000
    $ ssh -oTCPKeepAlive=yes myserver

This reproduces the problem quite easily: the connection is dropped
with a broken pipe after 45-60 seconds of idle time, and tcpdump
confirms that the keep-alive packets carry no TCP timestamps, even
though timestamps were negotiated when the connection was set up.
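
To spot such packets on the server, a capture filter along the
following lines may help. This is only a sketch: the function name is
hypothetical and em0 is a placeholder interface name. The filter
matches TCP segments whose header is exactly 20 bytes long, i.e. that
carry no TCP options at all (and therefore no timestamp), excluding
SYN and RST segments. It will not catch segments that carry other
options but lack the timestamp, and the tcp[] offsets apply to IPv4.

```shell
# Print a pcap filter for option-less TCP segments on the given port
# (default 22). tcp[12] holds the data offset in its upper four bits;
# 0x50 means 5 words = 20 bytes, so no room for any TCP option.
no_tcp_options_filter() {
    port="${1:-22}"
    printf 'tcp port %s and tcp[12] & 0xf0 = 0x50 and tcp[tcpflags] & (tcp-syn|tcp-rst) = 0\n' "$port"
}

# Usage on the server (needs root; replace em0 with the real interface):
#   tcpdump -v -ni em0 "$(no_tcp_options_filter 22)"
```

If keep-alives show up here on a connection whose SYN and SYN,ACK both
carried the TS option, the client is the one dropping the timestamps.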

There are various ways to work around the issue.

Client side workarounds:
- Use ServerAlive* settings in ~/.ssh/config (ssh only)
- Tune net.inet.tcp.keep* sysctls on macOS (for all services)
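
For the ssh-only variant, a minimal client configuration could look
like the fragment below. The host name is the one from the example
above; the interval and count are illustrative assumptions, anything
comfortably below the point where the connection breaks should do.

```
# ~/.ssh/config on the client (values are illustrative assumptions)
Host myserver
    # Send a keep-alive through the ssh channel after 60 s without
    # data from the server
    ServerAliveInterval 60
    # Give up after 3 unanswered keep-alives (the OpenSSH default)
    ServerAliveCountMax 3
```

These messages travel inside the encrypted channel as regular data, so
the connection never becomes idle long enough for TCP keep-alive to
matter.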

Server side workarounds:
- Use ClientAlive* settings in /etc/ssh/sshd_config (ssh only)
- Tolerate missing timestamps in packets using sysctl, which makes
  FreeBSD 13 behave like previous versions did:

    sysctl net.inet.tcp.tolerate_missing_ts=1

The last option is probably the most practical one.
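
As a sketch, the two server-side options could be made persistent like
this. The ClientAlive* values are illustrative assumptions; sshd has to
reload its configuration before they take effect.

```
# /etc/ssh/sshd_config on the server (values are illustrative assumptions)
ClientAliveInterval 60
ClientAliveCountMax 3

# /etc/sysctl.conf on the server, to keep the setting across reboots
net.inet.tcp.tolerate_missing_ts=1
```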

rscheff@ and tuexen@ (thank you!) were able to reproduce the issue and
reached out to Apple to see if there is something they can do to fix
this at their end (macOS) in the future.

Best
Michael

[0]https://cgit.freebsd.org/src/commit/?id=283c76c7c3f2f634f19f303a771a3f81fe890cab
[1]https://datatracker.ietf.org/doc/html/rfc7323#section-3.2

-- 
Michael Gmelin