9.2 ixgbe tx queue hang
    Christopher Forgeron 
    csforgeron at gmail.com
       
    Tue Mar 25 14:16:58 UTC 2014
    
    
  
Hi guys,
 I'm in meetings today, so I'll respond to the other emails later.
 Just wanted to clarify about tp->t_tsomax: I can't make a solid assertion
about its value, as I only tracked it briefly. I did see it being !=
if_hw_tsomax, but that was a short test and should really be checked more
carefully. For now we should treat it as possible, but not confirmed.
 However, setting if_hw_tsomax as low as 32k did not fix the problem for
me. So either limiting the TSO size is not the fix, or not everything is
paying attention to if_hw_tsomax. It has to be one or the other.
 Setting IP_MAXPACKET does fix it for me, but of course that's not a solid
fix.
On Tue, Mar 25, 2014 at 9:16 AM, Markus Gebert
<markus.gebert at hostpoint.ch> wrote:
>
> On 25.03.2014, at 02:18, Rick Macklem <rmacklem at uoguelph.ca> wrote:
>
> > Christopher Forgeron wrote:
> >>
> >>
> >>
> >> This is regarding the TSO patch that Rick suggested earlier. (With
> >> many thanks for his time and suggestion)
> >>
> >>
> >> As I mentioned earlier, it did not fix the issue on a 10.0 system. It
> >> did make it less of a problem on 9.2, but either way, I think it's
> >> not needed, and shouldn't be considered as a patch for testing/etc.
> >>
> >>
> >> Patching TSO to anything other than a max value (and by default the
> >> code gives it IP_MAXPACKET) is confusing the matter, as the packet
> >> length ultimately needs to be adjusted for many things on the fly
> >> like TCP Options, etc. Using static header sizes won't be a good
> >> idea.
> >>
> > If you look at tcp_output(), you'll notice that it doesn't do TSO if
> > there are any options. That way it knows that the TCP/IP header is
> > just hdrlen.
> >
> > If you don't limit the TSO packet (including TCP/IP and ethernet headers)
> > to 64K, then the "ix" driver can't send them, which is the problem
> > you guys are seeing.
> >
> > There are other ways to fix this problem, but they all may introduce
> > issues that reducing if_hw_tsomax by a small amount does not.
> > For example, m_defrag() could be modified to use 4K pagesize clusters,
> > but this might introduce memory fragmentation problems. (I observed
> > what I think are memory fragmentation problems when I switched NFS
> > to use 4K pagesize clusters for large I/O messages.)
> >
> > If setting IP_MAXPACKET to 65518 fixes the problem (no more EFBIG
> > error replies), then that is the size that if_hw_tsomax can be set
> > to (just can't change IP_MAXPACKET, but that is defined for other
> > things). (It just happens that IP_MAXPACKET is what if_hw_tsomax
> > defaults to. It has no other effect w.r.t. TSO.)
> >
> >>
> >> Additionally, it seems that setting nic TSO will/may be ignored by
> >> code like this in sys/netinet/tcp_output.c:
> >>
>
> Is this confirmed or still a 'it seems'? Have you actually seen a
> tp->t_tsomax value in tcp_output() bigger than if_hw_tsomax or was this
> just speculation because the values are stored in different places? (Sorry,
> if you already stated this in another email, it's currently hard to keep
> track of all the information.)
>
> Anyway, this dtrace one-liner should be a good test if other values appear
> in tp->t_tsomax:
>
> # dtrace -n 'fbt::tcp_output:entry / args[0]->t_tsomax != 0 &&
> args[0]->t_tsomax != 65518 / { printf("unexpected tp->t_tsomax: %i\n",
> args[0]->t_tsomax); stack(); }'
>
> Remember to adjust the value in the condition to whatever you're currently
> expecting. The value seems to be 0 for new connections, probably when
> tcp_mss() has not been called yet. So that seems normal and I have
> excluded that case too. This will also print a kernel stack trace in case
> it sees an unexpected value.
>
>
> > Yes, but I don't know why.
> > The only conjecture I can come up with is that another net driver is
> > stacked above "ix" and the setting for if_hw_tsomax doesn't propagate
> > up. (If you look at the commit log message for r251296, the intent
> > of adding if_hw_tsomax was to allow device drivers to set a smaller
> > tsomax than IP_MAXPACKET.)
> >
> > Are you using any of the "stacked" network device drivers like
> > lagg? I don't even know what the others all are.
> > Maybe someone else can list them?
>
> I guess the most obvious are lagg and vlan (and probably carp on FreeBSD
> 9.x or older).
>
> On request from Jack, we've eliminated lagg and vlan from the picture,
> which gives us plain ixgbe interfaces with no stacked interfaces on top of
> it. And we can still reproduce the problem.
>
>
> Markus
>
>
> >
> > rick
> >>
> >> 10.0 Code:
> >>
> >>     if (len > tp->t_tsomax - hdrlen) {      /* !! */
> >>         len = tp->t_tsomax - hdrlen;        /* !! */
> >>         sendalot = 1;
> >>     }
> >>
> >>
> >>
> >>
> >> I've put debugging here, set the nic's max TSO as per Rick's patch
> >> (set to, say, 32k), and have seen that tp->t_tsomax == IP_MAXPACKET.
> >> It's being set someplace else, and thus our attempts to set TSO on
> >> the nic may be in vain.
> >>
> >>
> >> It may have mattered more in 9.2, as I see the code doesn't use
> >> tp->t_tsomax in some locations, and may actually default to what the
> >> nic is set to.
> >>
> >> The NIC may still win, I didn't walk through the code to confirm, it
> >> was enough to suggest to me that setting TSO wouldn't fix this
> >> issue.
> >>
> >>
> >> However, this is still a TSO related issue, it's just not one related
> >> to the setting of TSO's max size.
> >>
> >> A 10.0-STABLE system with tso disabled on ix0 doesn't have a single
> >> packet over IP_MAXPACKET in 1 hour of runtime. I'll let it go a bit
> >> longer to increase confidence in this assertion, but I don't want to
> >> waste time on this when I could be logging problem packets on a
> >> system with TSO enabled.
> >>
> >>
> >> Comments are very welcome.
> >>
> >>
> >>
    
    