[Bug 268910] ixgbe(4): rxcsum register setting can cause TCP connection hangs
Date: Thu, 12 Jan 2023 17:34:29 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=268910

            Bug ID: 268910
           Summary: ixgbe(4): rxcsum register setting can cause TCP connection hangs
           Product: Base System
           Version: 12.3-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: brian90013@gmail.com

Hello,

I started diagnosing a FreeBSD-12.3 system with an X550-T2 10GbE controller that was having trouble making outgoing TCP connections (ssh for example). I used tcpdump and could see a response returning to the host, but the client software didn't seem to receive the packet and move forward. The port was configured for 1 queue per direction (dev.ix.0.iflib.override_nrxqs=1 and dev.ix.0.iflib.override_ntxqs=1) and all offloads were disabled.

I discovered that enabling rxcsum restored stable networking. I could complete ssh connections as well as generic TCP/UDP connections using nc. As soon as I removed the rxcsum offload, the initial behavior returned. I also discovered that setting the number of queues > 1 was stable, while queues=1 hung connections.

I looked at ixgbe_initialize_receive_units() in if_ix.c, where the RXCSUM register is set. Here's the relevant section:

	rxcsum = IXGBE_READ_REG(hw, IXGBE_RXCSUM);

	ixgbe_initialize_rss_mapping(sc);

	if (sc->num_rx_queues > 1) {
		/* RSS and RX IPP Checksum are mutually exclusive */
		rxcsum |= IXGBE_RXCSUM_PCSD;
	}

	if (ifp->if_capenable & IFCAP_RXCSUM)
		rxcsum |= IXGBE_RXCSUM_PCSD;

	/* This is useful for calculating UDP/IP fragment checksums */
	if (!(rxcsum & IXGBE_RXCSUM_PCSD))
		rxcsum |= IXGBE_RXCSUM_IPPCSE;

	IXGBE_WRITE_REG(hw, IXGBE_RXCSUM, rxcsum);

Not hard to see that if either queues>1 OR rxcsum is enabled, the PCSD bit is set and IPPCSE is not set. In the case where queues==1 AND !rxcsum, PCSD is not set and IPPCSE is set (and my TCP connections hang). I believe there is a problem here but I'm not sure which part.
A few notes:

* The Linux driver unconditionally sets PCSD and doesn't touch IPPCSE in ixgbe_set_mrqc().
* The e1000 FreeBSD driver sets PCSD if rx_queues > 1. It only calls xxx_initialize_rss_mapping() if rx_queues > 1 is true. ixgbe calls xxx_initialize_rss_mapping() outside of the conditional, so it is always run. Does this mean PCSD must always be set?
* I couldn't find anywhere in the driver where the UDP/IP fragment checksum was used. Is there any benefit to setting IPPCSE?

I modified the above code to always set PCSD and never set IPPCSE. In my testing on the X550-T2 and an optical X520 82599ES, this change eliminated the hangs even with queues=1 and -rxcsum.

I'm hoping this is enough detail for someone to identify the true root cause and best solution. Please let me know if you need additional information.

-- 
You are receiving this mail because:
You are the assignee for the bug.