From nobody Thu Jul 18 16:34:12 2024
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net
Subject: Re: TCP Success Story (was Re: TCP_RACK, TCP_BBR, and firewalls)
From: tuexen@freebsd.org
Date: Thu, 18 Jul 2024 18:34:12 +0200
To: Junho Choi
Cc: Alan Somers, FreeBSD Net

> On 18. Jul 2024, at 15:00, Junho Choi wrote:
>
> Alan - this is a great result to see. Thanks for experimenting.
>
> Just curious why bbr and rack don't co-exist? Those are two separate things.
> Is it a current bug or by design?
Technically RACK and BBR can coexist. The problem was with pf and/or LRO.
But this is all fixed now in 14.1 and head.

Best regards
Michael
>
> BR,
>
> On Thu, Jul 18, 2024 at 5:27 AM wrote:
>> On 17. Jul 2024, at 22:00, Alan Somers wrote:
>>
>> On Sat, Jul 13, 2024 at 1:50 AM wrote:
>>>
>>>> On 13. Jul 2024, at 01:43, Alan Somers wrote:
>>>>
>>>> I've been experimenting with RACK and BBR. In my environment, they
>>>> can dramatically improve single-stream TCP performance, which is
>>>> awesome. But pf interferes. I have to disable pf in order for them
>>>> to work at all.
>>>>
>>>> Is this a known limitation? If not, I will experiment some more to
>>>> determine exactly what aspect of my pf configuration is responsible.
>>>> If so, can anybody suggest what changes would have to happen to make
>>>> the two compatible?
>>> A problem with same symptoms was already reported and fixed in
>>> https://reviews.freebsd.org/D43769
>>>
>>> Which version are you using?
>>>
>>> Best regards
>>> Michael
>>>>
>>>> -Alan
>>
>> TLDR; tcp_rack is good, cc_chd is better, and tcp_bbr is best
>>
>> I want to follow up with the list to post my conclusions. Firstly
>> tuexen@ helped me solve my problem: in FreeBSD 14.0 there is a 3-way
>> incompatibility between (tcp_bbr || tcp_rack) && lro && pf. I can
>> confirm that tcp_bbr works for me if I either disable LRO, disable PF,
>> or switch to a 14.1 server.
>>
>> Here's the real problem: on multiple production servers, downloading
>> large files (or ZFS send/recv streams) was slow. After ruling out
>> many possible causes, wireshark revealed that the connection was
>> suffering about 0.05% packet loss. I don't know the source of that
>> packet loss, but I don't believe it to be congestion-related. Along
>> with a 54ms RTT, that's a fatal combination for the throughput of
>> loss-based congestion control algorithms. According to the Mathis
>> Formula [1], I could only expect 1.1 MBps over such a connection.
>> That's actually worse than what I saw. With default settings
>> (cc_cubic), I averaged 5.6 MBps. Probably Mathis's assumptions are
>> outdated, but that's still pretty close for such a simple formula
>> that's 27 years old.
>>
>> So I benchmarked all available congestion control algorithms for
>> single download streams. The results are summarized in the table
>> below.
>>
>> Algo      Packet Loss Rate    Average Throughput
>> vegas     0.05%               2.0 MBps
>> newreno   0.05%               3.2 MBps
>> cubic     0.05%               5.6 MBps
>> hd        0.05%               8.6 MBps
>> cdg       0.05%               13.5 MBps
>> rack      0.04%               14 MBps
>> htcp      0.05%               15 MBps
>> dctcp     0.05%               15 MBps
>> chd       0.05%               17.3 MBps
>> bbr       0.05%               29.2 MBps
>> cubic     10%                 159 kBps
>> chd       10%                 208 kBps
>> bbr       10%                 5.7 MBps
>>
>> RACK seemed to achieve about the same maximum bandwidth as BBR, though
>> it took a lot longer to get there.
>> Also, with RACK, wireshark reported about 10x as many retransmissions
>> as dropped packets, which is suspicious.
>>
>> At one point, something went haywire and packet loss briefly spiked to
>> the neighborhood of 10%. I took advantage of the chaos to repeat my
>> measurements. As the table shows, all algorithms sucked under those
>> conditions, but BBR sucked impressively less than the others.
>>
>> Disclaimer: there was significant run-to-run variation; the presented
>> results are averages. And I did not attempt to measure packet loss
>> exactly for most runs; 0.05% is merely an average of a few selected
>> runs. These measurements were taken on a production server running a
>> real workload, which introduces noise. Soon I hope to have the
>> opportunity to repeat the experiment on an idle server in the same
>> environment.
>>
>> In conclusion, while we'd like to use BBR, we really can't until we
>> upgrade to 14.1, which hopefully will be soon. So in the meantime
>> we've switched all relevant servers from cubic to chd, and we'll
>> reevaluate BBR after the upgrade.
> Hi Alan,
>
> just to be clear: the version of BBR currently implemented is
> BBR version 1, which is known to be unfair in certain scenarios.
> Google is still working on BBR to address this problem and improve
> it in other aspects. But there is no RFC yet and the updates haven't
> been implemented yet in FreeBSD.
>
> Best regards
> Michael
>>
>> [1]: https://www.slac.stanford.edu/comp/net/wan-mon/thru-vs-loss.html
>>
>> -Alan
>
> --
> Junho Choi | https://saturnsoft.net
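
For readers who want to reproduce the Mathis estimate quoted in Alan's
message, here is a minimal sketch. It uses the simplified form
rate <= C * MSS / (RTT * sqrt(p)). The 54 ms RTT and 0.05% loss rate come
from the thread; the 1460-byte MSS, the constant C = 1, and the function
name are assumptions made for illustration, not values stated in the thread.

```python
import math


def mathis_throughput(mss_bytes: float, rtt_s: float, loss_rate: float,
                      c: float = 1.0) -> float:
    """Estimate the Mathis upper bound on TCP throughput, in bytes/second.

    mss_bytes: maximum segment size (1460 below is an assumed value)
    rtt_s:     round-trip time in seconds
    loss_rate: packet loss probability, in the range (0, 1]
    c:         model constant; 1.0 is an assumption for this sketch
    """
    if mss_bytes <= 0 or rtt_s <= 0:
        raise ValueError("mss_bytes and rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return c * mss_bytes / (rtt_s * math.sqrt(loss_rate))


if __name__ == "__main__":
    # RTT and loss rates are the ones reported in the thread.
    for loss in (0.0005, 0.10):
        rate = mathis_throughput(mss_bytes=1460, rtt_s=0.054, loss_rate=loss)
        print(f"loss={loss:.2%}  bound={rate / 1e6:.3f} MB/s")
```

With these assumed inputs the bound at 0.05% loss comes out at roughly
1.2 million bytes per second, in line with the 1.1 MBps figure above. The
exact value depends on the MSS and on the constant chosen.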