Re: Slow WAN traffic to FreeBSD hosts but not to Linux hosts---how to debug/fix?

From: Paul Mather <>
Date: Sat, 04 Feb 2023 18:11:57 UTC
On Feb 1, 2023, at 5:54 PM, David <> wrote:

> On 2/1/23 14:07, Paul Mather wrote:
>> On Feb 1, 2023, at 3:14 PM, Marek Zarychta <> wrote:
>>> W dniu 1.02.2023 o 20:33, Paul Mather pisze:
>>>> It looks like we may have a winner, folks.  I built and enabled the extra TCP stacks and for the first time was able to max out my connection to the remote FreeBSD system.  I get consistently higher throughput over the 15-hop WAN path to the remote FreeBSD system when using the RACK TCP stack than when using the default "freebsd" stack.
>>>> Although the speeds are consistently higher when using the setting "net.inet.tcp.functions_default=rack", they are still variable.  However, rather than the 3--4 MB/s I saw that kicked off this thread, I now average over 10 MB/s.
>>>> I actually get the best results with "net.inet.tcp.functions_default=bbr" (having loaded tcp_bbr).  That behaves very much like the Linux hosts in that speeds climb very quickly until it saturates the WAN connection.  I get the same high speeds from the remote FreeBSD system using tcp_bbr as I do to the Linux hosts.  I will stick with tcp_bbr for now as the default on my remote FreeBSD servers.  It appears to put them on a par with Linux for this WAN link.
>>> Thanks for the feedback, Paul. Please bear in mind that BBR v1, which is what is implemented in FreeBSD, is not a fair[1] congestion control algorithm. Maybe in the future we will have BBR v2 in the stack, but for now I don't recommend using BBR unless you want to act slightly as a, hm... network leecher. Maybe the Linux hosts behave this way, maybe they have implemented BBR v2; I am not familiar with Linux TCP stack enhancements. On the other hand, tcp_rack(4) is performant, well-tested in the FreeBSD stack, and considered fair and more acceptable for a fileserver, though not ideal, i.e., probably more computationally expensive and still missing some features like TCP-MD5.
>>> [1]
>> That is a fair and astute observation, Marek.  I am also not familiar with Linux TCP stack implementations but it had occurred to me that maybe Linux was not being an entirely good netizen whereas FreeBSD was behaving with impeccable net manners when it came to congestion control and being fair to others, and that is why Linux was getting faster speeds for me.  Then again, perhaps not. :-)
>> In the case of the remote FreeBSD hosts I use at $JOB, they have low numbers of users and so are more akin to endpoints than servers, so I'm not worried about "leeching" from them.  Also, my ISP download bandwidth is 1/5th of each FreeBSD system, so hopefully there is still plenty to go around after I max out my bulk downloads.  (Plus, I believe $JOB prefers my downloads to take half [or less] the time.) :-)
>> Hopefully we will get BBR v2 (or something even fairer) at some point.   IIRC, the FreeBSD Foundation has been highlighting some of this network stack work.  It would be a pity for it not to be enabled by default so more people could use it on -RELEASE without building a custom kernel.  I'm just glad right now I'm not stuck with 3--4 MB/s downloads any more.
>> Cheers,
>> Paul.
> Word of caution:
> It would appear not all FreeBSD applications like BBR or RACK. I run a Magento (e-commerce) VM and was getting weird pauses (hang for a bit, then resume) on the website. Running Magento requires several other TCP services and something wasn't happy. I'm not going to debug the problem, just wanted to give a heads up.

Thanks for the heads-up.  Since posting the above I have also noticed that BBR and RACK aren't unalloyed successes for me.  I, too, saw pauses and lockups/disconnections to FreeBSD systems with BBR enabled.  With RACK I only noticed pauses on some systems in my testing, so it was much more usable.

The problems I experienced were worst on 13.1-RELEASE.  A 13.1-RELEASE system I built and enabled BBR on was almost unusable to me.  RACK was better but also had issues.  Much better in my tests were BBR and RACK on 13-STABLE and -CURRENT.  I had no issues with RACK (other than more speed variability vs. BBR), whereas BBR did lock up one of my FreeBSD clients in testing.  (That client uses a Realtek re(4) NIC, so maybe BBR tickles more bugs in that driver.)  RACK caused no lockups and yielded good enough speeds for it to be my go-to combo now over BBR.
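For anyone wanting to try RACK themselves, the rough recipe I followed looks like the below.  This is only a sketch based on stock 13.x; the extra stacks aren't in GENERIC, so the kernel has to be rebuilt with the options shown in the comments, and you should double-check the names against your own release.

```shell
# Prerequisite (kernel config), then rebuild the kernel:
#   makeoptions WITH_EXTRA_TCP_STACKS=1
#   options     TCPHPTS

# Load the alternate TCP stack modules:
kldload tcp_rack
kldload tcp_bbr        # optional; see the fairness caveats above

# List the registered TCP function blocks (and which is the default):
sysctl net.inet.tcp.functions_available

# Make RACK the default stack for new TCP connections:
sysctl net.inet.tcp.functions_default=rack
```

Already-established connections keep the stack they were created with; only new connections pick up the changed default.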

I figure the better results from enabling BBR and RACK on -STABLE and -CURRENT vs. -RELEASE servers reflect improvements/bug fixes in the implementation since -RELEASE landed.  (Disclaimer: the -RELEASE test system uses em NICs whereas the -STABLE and -CURRENT systems I used in testing use igb NICs, so maybe that is a factor?)
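To make the choice survive a reboot, the usual pair of config fragments is something like the following (again a sketch; the `tcp_rack_load` knob assumes the module name on 13.x):

```shell
# /boot/loader.conf -- load the RACK stack module at boot:
tcp_rack_load="YES"

# /etc/sysctl.conf -- make it the default for new TCP connections:
net.inet.tcp.functions_default=rack
```

The same pattern applies to tcp_bbr if you decide its trade-offs are acceptable on your link.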