ix(intel) vs mlxen(mellanox) 10Gb performance

Rick Macklem rmacklem at uoguelph.ca
Tue Aug 18 22:04:33 UTC 2015


Hans Petter Selasky wrote:
> On 08/18/15 14:53, Rick Macklem wrote:
> > If this is just a test machine, maybe you could test with these lines (at
> > about #880) in sys/netinet/tcp_output.c commented out? (It looks to me
> > like this will disable TSO for almost all the NFS writes.)
> > - around line #880 in sys/netinet/tcp_output.c:
> > 			/*
> > 			 * In case there are too many small fragments
> > 			 * don't use TSO:
> > 			 */
> > 			if (len <= max_len) {
> > 				len = max_len;
> > 				sendalot = 1;
> > 				tso = 0;
> > 			}
> >
> > This was added along with the other stuff that did the
> > if_hw_tsomaxsegcount, etc and I never noticed it until now (not my patch).
> 
> FYI:
> 
> These lines are needed by other hardware, like the mlxen driver. If you
> remove them mlxen will start doing m_defrag(). I believe if you set the
> correct parameters in the "struct ifnet" for the TSO size/count limits
> this problem will go away. If you print the "len" and "max_len" and also
> the cases where TSO limits are reached, you'll see what parameter is
> triggering it and needs to be increased.
> 
Well, if the driver isn't setting if_hw_tsomaxsegcount correctly, then it
is the driver that needs to be fixed.
Having the above code block disable TSO for all of the NFS writes, including
for drivers that set if_hw_tsomaxsegcount correctly, doesn't make sense to me.
If the driver authors don't set these limits, the drivers do lots of
m_defrag() calls. I have posted more than once to freebsd-net@ asking the
driver authors to set them, and some now have. (I can't do it, because I
don't have the hardware to test it with.)

I do think that most/all of them don't subtract 1 for the tcp/ip header, and
I don't think they should be expected to, since the driver isn't supposed to
worry about the protocol at that level.
--> I think tcp_output() should subtract one from the if_hw_tsomaxsegcount
    provided by the driver to handle this, since it counts the mbufs
    (the while() loop at around line #825 in sys/netinet/tcp_output.c)
    before it prepends the tcp/ip header mbuf.
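
To make the "subtract one" idea concrete, here is a rough userland sketch.
It is not the tcp_output() code: "struct seg" and tso_payload_limit() are
made-up names standing in for an mbuf chain and the counting loop, and the
only point is that the loop stops at max_segs - 1 so one segment is left
over for the header mbuf that gets prepended afterwards.

```c
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: only the fields this sketch needs. */
struct seg {
	struct seg *next;
	size_t len;
};

/*
 * Return how many payload bytes can be sent when the driver allows
 * max_segs segments in total and one of them is reserved for the
 * prepended tcp/ip header, i.e. the payload may use max_segs - 1.
 */
static size_t
tso_payload_limit(const struct seg *s, size_t want, unsigned int max_segs)
{
	/* Reserve one segment for the header; guard against underflow. */
	unsigned int avail = (max_segs > 0) ? max_segs - 1 : 0;
	size_t total = 0;

	while (s != NULL && want > 0 && avail > 0) {
		/* Take the whole segment, or just the part still wanted. */
		size_t take = (s->len < want) ? s->len : want;

		total += take;
		want -= take;
		avail--;
		s = s->next;
	}
	return (total);
}
```

With three 2048 byte segments and a driver limit of 3, this gives 4096
bytes of payload (two segments), so the packet handed to the driver is
three segments once the header is added, and no m_defrag() is needed.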

rick

> --HPS
> 

