vmx: strange issue, related to TSO?

Andriy Gapon avg at FreeBSD.org
Fri Dec 27 22:01:06 UTC 2019


On 27/12/2019 15:34, Vincenzo Maffione wrote:
> It may be useful to check what happens if you replace the vmx0 interface with an
> em0.
> In this way you would know if the issue is vmx-specific or not.

I'll put this on my to-do list; I can't test it right now.

But one thing I noticed when comparing the TCP control block of the connection
before and after the "TSO dance" is that TF_TSO gets cleared after any outgoing
traffic while TSO is disabled on the interface.  And the flag does not come back
after TSO is reenabled.  Any new connections get the flag, of course.

So, I indeed suspect that there is a problem with vmx TSO.
As another data point, an older system from before the vmx->iflib conversion does
not exhibit the problem.
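
To make the TF_TSO observation above concrete, here is a toy model of the
behaviour as I observed it.  It is not kernel code: the Interface and
Connection classes, the tso_dance() helper and the bit value used for TF_TSO
are all made up for illustration.  The only things it encodes are that a new
connection gets the flag when TSO is enabled, that sending while TSO is
disabled clears it, and that it was not seen to come back.

```python
from dataclasses import dataclass

TF_TSO = 0x1  # illustrative bit value only, not the kernel's definition


@dataclass
class Interface:
    """Hypothetical stand-in for vmx0; only tracks whether TSO is enabled."""
    tso_enabled: bool = True


class Connection:
    """Hypothetical stand-in for a TCP control block."""

    def __init__(self, ifp):
        self.ifp = ifp
        # A new connection picks up the flag if TSO is enabled at that time.
        self.flags = TF_TSO if ifp.tso_enabled else 0

    def send(self):
        # Observed: outgoing traffic while TSO is disabled clears the flag.
        if not self.ifp.tso_enabled:
            self.flags &= ~TF_TSO
        # Observed: the flag did not come back on later sends.

    @property
    def uses_tso(self):
        return bool(self.flags & TF_TSO)


def tso_dance():
    """Replay the sequence: -tso, send, tso, send, then open a new connection."""
    ifp = Interface(tso_enabled=True)
    old = Connection(ifp)
    ifp.tso_enabled = False   # ifconfig vmx0 -tso
    old.send()
    ifp.tso_enabled = True    # ifconfig vmx0 tso
    old.send()
    new = Connection(ifp)
    return old.uses_tso, new.uses_tso


if __name__ == "__main__":
    print(tso_dance())
```

In this model the old connection ends up without the flag while the new one
has it, which matches what I see in the control blocks.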

> On Thu, 26 Dec 2019 at 20:04, Andriy Gapon <avg at freebsd.org> wrote:
> 
> 
>     Maybe someone has some pointers for me regarding the following problem.
>     This happens with CURRENT as of the beginning of September.
>     I connect via ssh to a VM running on VMware, it has a single vmx0 interface.
>     The problem is that when I print a moderately large amount of text to the
>     terminal (e.g., tail -100 /var/log/messages) I literally see it printed in
>     chunks with noticeable pauses between chunks.  It takes several seconds for all
>     lines to get shown.  This happens every time I do it.
>     There is an interesting twist.  If I disable TSO with ifconfig vmx0 -tso and
>     print the same output in the same ssh session, then the output is smooth and
>     fast as I would expect it.  The lines scroll by almost instantly.
>     If I then re-enable TSO and again produce the same output in the same ssh
>     session, then it is still fast.
> 
>     It appears that the TCP connection gets tuned to some very sub-optimal
>     parameters when TSO is enabled.  When I disable TSO, the parameters get re-tuned
>     to better values and the values stick when I re-enable TSO.
>     This is just a conjecture, of course.
> 
>     I have some tcpdump captures, but I do not see anything that would really stand
>     out.  One difference is that in the slow case only "full sized" packets are sent
>     while in the fast case there are shorter packets with push flag.
> 
>     Some packets for the slow case:
>      00:00:00.453202 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags [.], seq
>     37:1485, ack 36, win 128, options [nop,nop,TS val 1403195134 ecr 4966311],
>     length 1448
>      00:00:00.096859 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags [.], ack 1485,
>     win 1026, options [nop,nop,TS val 4966864 ecr 1403195134], length 0
>      00:00:00.442963 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags [.], seq
>     1485:2933, ack 36, win 128, options [nop,nop,TS val 1403195664 ecr 4966864],
>     length 1448
>      00:00:00.092677 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags [.], ack 2933,
>     win 1026, options [nop,nop,TS val 4967400 ecr 1403195664], length 0
>      00:00:00.437336 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags [.], seq
>     2933:4381, ack 36, win 128, options [nop,nop,TS val 1403196194 ecr 4967400],
>     length 1448
>      00:00:00.097190 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags [.], ack 4381,
>     win 1026, options [nop,nop,TS val 4967934 ecr 1403196194], length 0
> 
>     Some packets after the TSO dance:
>      00:00:00.000450 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [.], seq
>     4077:5525, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510],
>     length 1448
>      00:00:00.000016 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [P.], seq
>     5525:6097, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510],
>     length 572
>      00:00:00.000009 IP 10.180.1.29.25369 > 10.180.106.180.22: Flags [.], ack 5525,
>     win 1003, options [nop,nop,TS val 21706510 ecr 2124310129], length 0
>      00:00:00.000303 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [.], seq
>     6097:7545, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510],
>     length 1448
>      00:00:00.000019 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [P.], seq
>     7545:8117, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510],
>     length 572
>      00:00:00.000013 IP 10.180.1.29.25369 > 10.180.106.180.22: Flags [.], ack 7545,
>     win 1003, options [nop,nop,TS val 21706510 ecr 2124310129], length 0
>      00:00:00.000162 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [.], seq
>     8117:9565, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510],
>     length 1448
>      00:00:00.000012 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [P.], seq
>     9565:10137, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr 21706510],
>     length 572
>      00:00:00.000007 IP 10.180.1.29.25369 > 10.180.106.180.22: Flags [.], ack 9565,
>     win 1003, options [nop,nop,TS val 21706510 ecr 2124310129], length 0
> 
>     What else can I examine to debug the problem further?
>     Thank you!
>     -- 
>     Andriy Gapon
> 
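
For what it's worth, the timing difference between the two captures quoted
above can be put into numbers with a small script.  This is only a sketch: the
deltas and lengths are copied by hand from the excerpts, summarize() is a
made-up helper, and with so few packets the rates are indicative at best.

```python
# (delta in seconds since the previous packet, TCP payload length in bytes),
# copied from the two tcpdump excerpts quoted above.
SLOW = [(0.453202, 1448), (0.096859, 0), (0.442963, 1448),
        (0.092677, 0), (0.437336, 1448), (0.097190, 0)]
FAST = [(0.000450, 1448), (0.000016, 572), (0.000009, 0),
        (0.000303, 1448), (0.000019, 572), (0.000013, 0),
        (0.000162, 1448), (0.000012, 572), (0.000007, 0)]


def summarize(packets):
    """Return elapsed time, payload bytes, rough rate and the mean delta
    that precedes a data-carrying segment."""
    elapsed = sum(delta for delta, _ in packets)
    payload = sum(length for _, length in packets)
    data_gaps = [delta for delta, length in packets if length > 0]
    return {
        "elapsed_s": elapsed,
        "payload_bytes": payload,
        "bytes_per_s": payload / elapsed,
        "mean_gap_before_data_s": sum(data_gaps) / len(data_gaps),
    }


if __name__ == "__main__":
    for name, packets in (("slow", SLOW), ("fast", FAST)):
        print(name, summarize(packets))
```

In the slow excerpt every data segment goes out roughly 0.44 s after the
previous ACK arrived, so the sender appears to sit idle for most of the time,
whereas in the fast excerpt the gaps are well under a millisecond.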


-- 
Andriy Gapon

