From: Alan Somers
Date: Thu, 18 Jul 2024 08:03:33 -0600
Subject: Re: TCP Success Story (was Re: TCP_RACK, TCP_BBR, and firewalls)
To: tuexen@freebsd.org
Cc: FreeBSD Net

On Wed, Jul 17, 2024 at 2:27 PM wrote:
>
> > On 17. Jul 2024, at 22:00, Alan Somers wrote:
> >
> > On Sat, Jul 13, 2024 at 1:50 AM wrote:
> >>
> >>> On 13. Jul 2024, at 01:43, Alan Somers wrote:
> >>>
> >>> I've been experimenting with RACK and BBR. In my environment, they
> >>> can dramatically improve single-stream TCP performance, which is
> >>> awesome. But pf interferes. I have to disable pf in order for them
> >>> to work at all.
> >>>
> >>> Is this a known limitation? If not, I will experiment some more to
> >>> determine exactly what aspect of my pf configuration is responsible.
> >>> If so, can anybody suggest what changes would have to happen to
> >>> make the two compatible?
> >> A problem with the same symptoms was already reported and fixed in
> >> https://reviews.freebsd.org/D43769
> >>
> >> Which version are you using?
> >>
> >> Best regards
> >> Michael
> >>>
> >>> -Alan
> >
> > TL;DR: tcp_rack is good, cc_chd is better, and tcp_bbr is best.
> >
> > I want to follow up with the list to post my conclusions. First,
> > tuexen@ helped me solve my problem: in FreeBSD 14.0 there is a
> > three-way incompatibility between (tcp_bbr || tcp_rack) && lro && pf.
> > I can confirm that tcp_bbr works for me if I either disable LRO,
> > disable pf, or switch to a 14.1 server.
> >
> > Here's the real problem: on multiple production servers, downloading
> > large files (or ZFS send/recv streams) was slow. After ruling out
> > many possible causes, Wireshark revealed that the connection was
> > suffering about 0.05% packet loss. I don't know the source of that
> > packet loss, but I don't believe it to be congestion-related. Along
> > with a 54 ms RTT, that's a fatal combination for the throughput of
> > loss-based congestion control algorithms. According to the Mathis
> > formula [1], I could expect only 1.1 MBps over such a connection.
> > That's actually worse than what I saw: with default settings
> > (cc_cubic), I averaged 5.6 MBps. Probably Mathis's assumptions are
> > outdated, but that's still pretty close for such a simple formula
> > that's 27 years old.
> >
> > So I benchmarked all available congestion control algorithms for
> > single download streams. The results are summarized in the table
> > below.
> >
> > Algo      Packet Loss Rate   Average Throughput
> > vegas          0.05%             2.0 MBps
> > newreno        0.05%             3.2 MBps
> > cubic          0.05%             5.6 MBps
> > hd             0.05%             8.6 MBps
> > cdg            0.05%            13.5 MBps
> > rack           0.04%            14 MBps
> > htcp           0.05%            15 MBps
> > dctcp          0.05%            15 MBps
> > chd            0.05%            17.3 MBps
> > bbr            0.05%            29.2 MBps
> > cubic         10%               159 kBps
> > chd           10%               208 kBps
> > bbr           10%               5.7 MBps
> >
> > RACK seemed to achieve about the same maximum bandwidth as BBR,
> > though it took a lot longer to get there. Also, with RACK, Wireshark
> > reported about 10x as many retransmissions as dropped packets, which
> > is suspicious.
> >
> > At one point, something went haywire and packet loss briefly spiked
> > to the neighborhood of 10%. I took advantage of the chaos to repeat
> > my measurements. As the table shows, all algorithms sucked under
> > those conditions, but BBR sucked impressively less than the others.
> >
> > Disclaimer: there was significant run-to-run variation; the
> > presented results are averages. And I did not attempt to measure
> > packet loss exactly for most runs; 0.05% is merely an average of a
> > few selected runs. These measurements were taken on a production
> > server running a real workload, which introduces noise. Soon I hope
> > to have the opportunity to repeat the experiment on an idle server
> > in the same environment.
> >
> > In conclusion, while we'd like to use BBR, we really can't until we
> > upgrade to 14.1, which hopefully will be soon. So in the meantime
> > we've switched all relevant servers from cubic to chd, and we'll
> > reevaluate BBR after the upgrade.
>
> Hi Alan,
>
> just to be clear: the version of BBR currently implemented is BBR
> version 1, which is known to be unfair in certain scenarios. Google
> is still working on BBR to address this problem and to improve it in
> other aspects, but there is no RFC yet, and the updates haven't been
> implemented in FreeBSD yet.
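Good to know, thanks. For anyone following along, this is roughly how
we applied the workaround and the algorithm switch on the 14.0
machines. Treat it as a sketch rather than a recipe: "ix0" is just a
placeholder for your interface, the sysctl defaults only affect new
connections, and the BBR stack additionally needs a kernel with
TCPHPTS support.

    # Work around the 14.0 (tcp_bbr || tcp_rack) && lro && pf bug by
    # disabling LRO on the affected NIC ("ix0" is a placeholder)
    ifconfig ix0 -lro

    # Switch the default congestion control algorithm to CHD
    kldload cc_chd
    sysctl net.inet.tcp.cc.algorithm=chd

    # After the 14.1 upgrade: load and select the BBR TCP stack
    kldload tcp_bbr
    sysctl net.inet.tcp.functions_default=bbr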
I've also heard that RACK suffers from fairness problems. Do you know how RACK and BBR compare in terms of fairness?
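P.S. For anyone who wants to sanity-check the Mathis number above: the
formula estimates throughput as roughly MSS / (RTT * sqrt(p)), with a
constant factor near 1 that I'm omitting. Assuming a 1448-byte MSS
(typical for a 1500-byte MTU):

    1448 bytes / (0.054 s * sqrt(0.0005))
      = 1448 / (0.054 * 0.0224)
     ~= 1.2 MB/s

which is right around the 1.1 MBps figure I quoted.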