svn commit: r197391 - head/sys/dev/mxge

Andrew Gallatin gallatin at cs.duke.edu
Tue Sep 22 13:44:44 UTC 2009


Robert Watson wrote:
 >
 > On Mon, 21 Sep 2009, Andrew Gallatin wrote:
 >
 >>  Add support for throttling transmit bandwidth.  This is most commonly
 >>  used to reduce packet loss on high delay (WAN) paths with a
 >>  slow link.
 >
 > Hi Drew--
 >
 > Could you say a little more about the situations in which this is used?
 > I see (or think I see) that the card supports internal throttling, but

As you say, the card supports it, and I ported it from our Linux and
Solaris drivers because we've had a few requests for it.  I don't
pretend to know a lot about WAN tuning, but our Linux customers claim
this works better for them than Linux's host-based traffic shaping.

 > is there a reason, other than the hardware supporting it, not to do
 > something like this higher in the stack before cycles are burned and PCI
 > bus bandwidth has been wasted?

This throttling increases neither CPU usage nor PCI bus bandwidth
usage.  It uses a very simple mechanism: the NIC inserts delays
between issuing PCIe DMA reads (it's slightly more complex than this,
but that's the easiest way to think of it).  This decreases the
maximum transmit bandwidth transparently to the host.  Throttle it
enough, and it is almost as if you had placed the NIC in a slower
PCIe slot.  The key is that only DMA reads are slowed, so the NIC can
still receive at full bandwidth.
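
(For the curious: assuming the knob ends up exposed as a per-device
sysctl, something like the sketch below would set it from userland.
The node name "dev.mxge.0.throttle" is a guess on my part, as are the
value semantics; check the driver's sysctl tree for the real one.)

#include <sys/types.h>
#include <sys/sysctl.h>

#include <stdio.h>
#include <stdlib.h>

/*
 * Sketch only: set the NIC's transmit throttle from userland.
 * The node name below is hypothetical; consult the driver for
 * the actual name and the legal range of values.
 */
int
main(int argc, char **argv)
{
	int throttle;

	if (argc != 2) {
		fprintf(stderr, "usage: %s throttle-value\n", argv[0]);
		return (1);
	}
	throttle = atoi(argv[1]);
	if (sysctlbyname("dev.mxge.0.throttle", NULL, NULL,
	    &throttle, sizeof(throttle)) == -1) {
		perror("sysctlbyname");
		return (1);
	}
	return (0);
}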

The advantage over host-based traffic shaping is that no software in
the host is involved, so it uses no host CPU.  It is also effective at
throttling TCP connections where TSO is in use.  This means you don't
have to either disable TSO or worry about 64KB bursts at full
bandwidth, as you would with a host-based traffic shaping solution.
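
To put rough numbers on the TSO point: a single 64KB burst at 10Gb/s
occupies the wire for only about 52us (64KB * 8 bits / 10^10 b/s),
but a 100Mb/s WAN link needs roughly 5.2ms to drain it, so each burst
must either sit in a router queue or be dropped.  (Illustrative
back-of-the-envelope numbers, not measurements.)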

In addition to WAN scenarios, I should also have mentioned that it is
useful in LAN scenarios.  Picture a cluster of machines built around
some PCIe chipset with unbalanced read/write DMA performance (the
HT2000 is one such chipset: it can send at line rate, but can only
receive at 7Gb/s).  If link-level flow control is not effective, you
wind up with massive packet loss.  TCP can sometimes deal
with this, but UDP is just miserable when you have a steady state
where the sender can send faster than the receiver can receive.  In
fact, one of the scenarios where this was most helpful was a project
where data was being collected on machine A, and forwarded to machine
B via UDP.  For security reasons, the physical link was simplex (only
one physical fiber between the TX laser on A and the RX on B), so
neither TCP nor link-level flow control was possible.  These were also
"unbalanced" machines.

Drew

