SACK scoreboard accounting improvement
Richard.Scheffenegger at netapp.com
Fri May 1 21:41:21 UTC 2020
D18624, which improves SACK scoreboard accounting (and, as a side effect, addresses some exploits / gaming of SACK loss recovery by deliberately omitting all but one SACK block or shrinking the reported size of SACK blocks), has been waiting for formal review for nearly 1.5 years (!) now. (Yes, I did repeatedly bring this set of patches up in the transport call.)
JTL@ did have a casual read through the change and didn't find any blatantly obvious mistakes, although some minor differences in SACK loss recovery in certain corner cases (more than 3 distinct SACK blocks) are to be expected (with the updated code sending out data more in conformance with RFC3517). I am sufficiently confident that I have done my homework: running instrumented kernels, tracking the old vs. new calculated values, and finding how to "game" the currently existing SACK scoreboard (mostly to the disadvantage of a sender, but also how to temporarily get data slightly ahead of time).
Also, D18624 is the basis for D18892 (Proportional Rate Reduction, RFC6937), where a new/retransmitted data segment is sent every (fractional) n ACKs, instead of waiting until sufficient packets have drained from the network (hoping that no ACK thinning has occurred) and then sending the new/retransmitted segments at the original pace (but after a minuscule pause).
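To illustrate the "every (fractional) n ACKs" behaviour, here is a minimal sketch of the per-ACK send budget as described in RFC6937 (proportional part plus the slow-start reduction bound). It is not the code from D18892; the struct, the field names and the function prr_sndcnt() are hypothetical and simply follow the variable names used in the RFC. All quantities are assumed to be in bytes, and the pipe estimate is assumed to be supplied by the caller (e.g. from the SACK scoreboard).

```c
#include <stdint.h>

/* Hypothetical per-connection PRR state; names follow RFC6937, not D18892. */
struct prr_state {
	uint32_t ssthresh;      /* target window after the reduction */
	uint32_t recover_fs;    /* RecoverFS: flight size when recovery started */
	uint32_t prr_delivered; /* bytes delivered to the receiver in recovery */
	uint32_t prr_out;       /* bytes sent in recovery (caller updates this) */
	uint32_t mss;           /* sender maximum segment size */
};

/*
 * Called once per ACK during loss recovery.
 * delivered_data: newly cumulatively ACKed plus newly SACKed bytes.
 * pipe:           caller's estimate of bytes still in the network.
 * Returns how many bytes may be sent in response to this ACK.
 */
static uint32_t
prr_sndcnt(struct prr_state *p, uint32_t delivered_data, uint32_t pipe)
{
	p->prr_delivered += delivered_data;

	if (p->recover_fs == 0)
		return (0);	/* sketch only: avoid division by zero */

	if (pipe > p->ssthresh) {
		/* Proportional part: CEIL(prr_delivered * ssthresh / RecoverFS) */
		uint64_t target = ((uint64_t)p->prr_delivered * p->ssthresh +
		    p->recover_fs - 1) / p->recover_fs;
		return (target > p->prr_out ? (uint32_t)(target - p->prr_out) : 0);
	} else {
		/* Slow-start reduction bound: catch up, but by at most one extra MSS */
		uint32_t behind = p->prr_delivered > p->prr_out ?
		    p->prr_delivered - p->prr_out : 0;
		uint32_t limit = (behind > delivered_data ? behind : delivered_data) +
		    p->mss;
		uint32_t room = p->ssthresh - pipe;
		return (room < limit ? room : limit);
	}
}
```

With ssthresh at half of RecoverFS, each ACK for one MSS grants half an MSS of budget, so under these assumptions roughly one segment goes out for every second ACK rather than in a burst once the pipe has drained.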
Furthermore, D18624 is also the basis for D18985 (a partial implementation of RFC6675 SACK, primarily delivering the rescue retransmission).
I would like to propose a timeout scheme for these three Diffs now, as we have done for a select few others in the past: with a cadence of ~4 weeks in between, apply these to HEAD, ready to revert them if necessary. ~16 weeks after the initial commit, MFC to stable/12 (RFC6675 SACK and RFC6937 PRR are new features, so no MFC to /11; D18624 itself may eventually be a candidate for /11, because of the slim chance of a remote, temporary exploitation of a session).
Note: Some packetdrill scripts will have to be adjusted, as SACK timing / rescue retransmission may interfere with current test/regression scripts.