kern/167325: sosend sometimes return EINVAL with TSO and VLAN on 82599 NIC

Greg Becker greg at
Thu Apr 26 14:00:22 UTC 2012

>Number:         167325
>Category:       kern
>Synopsis:       sosend sometimes return EINVAL with TSO and VLAN on 82599 NIC
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Apr 26 14:00:22 UTC 2012
>Originator:     Greg Becker
>Release:        FreeBSD-8.2
FreeBSD 8.2-RELEASE FreeBSD 8.2-RELEASE #84: Wed Apr 25 20:34:52 CDT 2012     greg at  amd64

At CacheIQ we discovered a problem where sosend() would occasionally return EINVAL.  This is pretty bad, as it leads our software to believe that something is wrong with the calling arguments or that the socket is in a problematic state.

Digging into it, we found that every now and then a request larger than dmat->maxsize is rejected by bus_dmamap_load_mbuf_sg(), and that error is what sosend() returns as EINVAL.
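
To illustrate the rejection described above, here is a minimal sketch of the size check, not the kernel code itself.  The struct and function names (dma_tag_sketch, load_mbuf_check) are hypothetical; the only behavior modeled is that a packet longer than the tag's maxsize yields EINVAL.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the one DMA tag field that matters here. */
struct dma_tag_sketch {
	size_t maxsize;		/* largest mapping the tag accepts, in bytes */
};

/*
 * Return 0 if a packet of pkt_len bytes would pass the size check,
 * or EINVAL if it exceeds the tag's maxsize.  The real routine goes
 * on to build the scatter/gather list; that part is omitted.
 */
static int
load_mbuf_check(const struct dma_tag_sketch *tag, size_t pkt_len)
{
	return (pkt_len <= tag->maxsize ? 0 : EINVAL);
}
```

With an assumed maxsize of 65535 (the report does not state the actual value), a 65534-byte packet passes and a 65538-byte packet is rejected.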

It's important to note that in our environment we are using the Intel 82599 10Gb/s adapter with TSO enabled and VLAN tagging (QINQ).

The problem only occurs for TSO packets generated by tcp_output().  When the failure occurs, we see that tcp_output() submits an mbuf chain to ip_output() that is 65520 bytes long.  ip_output() adds on the link header, bringing it to 65534 bytes.  Finally, vlan_tag() adds 4 more bytes, bringing it to 65538 bytes, which fails the lower-level size check.
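
The length accounting above can be written out as a small helper.  This is only a sketch of the arithmetic; the names are hypothetical, and the 14- and 4-byte constants are the Ethernet header and VLAN encapsulation sizes implied by the numbers in this report.

```c
/* Sizes implied by the figures above (65520 -> 65534 -> 65538). */
#define SKETCH_ETHER_HDR_LEN	14	/* link header added below IP */
#define SKETCH_VLAN_ENCAP_LEN	4	/* one VLAN tag */

/*
 * Length of the frame handed to the driver for an mbuf chain of
 * chain_len bytes carrying nvlan VLAN tags.
 */
static long
frame_len(long chain_len, int nvlan)
{
	return (chain_len + SKETCH_ETHER_HDR_LEN +
	    (long)nvlan * SKETCH_VLAN_ENCAP_LEN);
}
```

For the failing case, frame_len(65520, 0) is 65534 and frame_len(65520, 1) is 65538.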

As a quick fix we factor in max_linkhdr into the TSO length adjustment in tcp_output().
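
As a rough illustration of the adjustment, the sketch below mirrors the length clamp from the attached patch as a standalone function.  The function name and the TCP_MAXWIN_SKETCH macro are hypothetical; passing linkhdr = 0 reproduces the unpatched arithmetic, and a nonzero linkhdr stands in for max_linkhdr.

```c
#define TCP_MAXWIN_SKETCH	65535L	/* same value as TCP_MAXWIN */

/*
 * Clamp a TSO send length so that payload plus headers (and, when
 * linkhdr is nonzero, the reserved link header space) stays under
 * TCP_MAXWIN, then round down to a multiple of the per-segment size.
 */
static long
tso_clamp_len(long len, long hdrlen, long optlen, long maxopd, long linkhdr)
{
	long limit = TCP_MAXWIN_SKETCH - hdrlen - optlen - linkhdr;

	if (len > limit) {
		len = limit;
		len -= len % (maxopd - optlen - linkhdr);
	}
	return (len);
}
```

With illustrative values hdrlen = 40, optlen = 12, maxopd = 1460 and a 100000-byte request, the unpatched form (linkhdr = 0) yields 65160 bytes, while linkhdr = 16 yields 64440 bytes, leaving room for the link header and a VLAN tag.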

Enable TSO and VLAN tagging on an 82599 NIC, and drive a lot of data through the system.  We use a custom tool, but any networking tool that can keep the link saturated should illustrate the problem.

Note: our software is based on FreeBSD-8.2, but includes a number of fixes and select updates.

Factor max_linkhdr into the TSO length adjustment in tcp_output().

Patch attached with submission follows:

Index: freebsd/RELENG_8/src/sys/netinet/tcp_output.c
--- freebsd/RELENG_8/src/sys/netinet/tcp_output.c	(revision 5803)
+++ freebsd/RELENG_8/src/sys/netinet/tcp_output.c	(working copy)
@@ -747,9 +747,9 @@
 	if (len + optlen + ipoptlen > tp->t_maxopd) {
 		flags &= ~TH_FIN;
 		if (tso) {
-			if (len > TCP_MAXWIN - hdrlen - optlen) {
-				len = TCP_MAXWIN - hdrlen - optlen;
-				len = len - (len % (tp->t_maxopd - optlen));
+			if (len > TCP_MAXWIN - hdrlen - optlen - max_linkhdr) {
+				len = TCP_MAXWIN - hdrlen - optlen - max_linkhdr;
+				len = len - (len % (tp->t_maxopd - optlen - max_linkhdr));
 				sendalot = 1;
 			} else if (tp->t_flags & TF_NEEDFIN)
 				sendalot = 1;

