RX checksum offloading problem

Michael Tuexen Michael.Tuexen at lurchi.franken.de
Fri May 9 10:47:04 UTC 2014

On 09 May 2014, at 03:35, Yonghyeon PYUN <pyunyh at gmail.com> wrote:

> On Thu, May 08, 2014 at 08:40:22PM +0200, Michael Tuexen wrote:
>> On 07 May 2014, at 10:37, Yonghyeon PYUN <pyunyh at gmail.com> wrote:
>>> On Wed, May 07, 2014 at 10:07:09AM +0200, Michael Tuexen wrote:
>>>> On 07 May 2014, at 09:56, Yonghyeon PYUN <pyunyh at gmail.com> wrote:
>>>>> On Sat, May 03, 2014 at 11:52:47AM +0200, Michael Tuexen wrote:
>>>>>> On 02 May 2014, at 16:02, Bjoern A. Zeeb <bz at FreeBSD.org> wrote:
>>>>>>> On 02 May 2014, at 10:22 , Michael Tuexen <Michael.Tuexen at lurchi.franken.de> wrote:
>>>>>>>> Dear all,
>>>>>>>> during testing I found that FreeBSD head (on a raspberry pi) accepts SCTP packet
>>>>>>>> with bad checksums. After debugging this I figured out that this is a problem with
>>>>>>>> the csum_flags defined in mbuf.h.
>>>>>>>> The SCTP code on its input path checks for CSUM_SCTP_VALID, which is defined in mbuf.h:
>>>>>>>> #define CSUM_SCTP_VALID         CSUM_L4_VALID
>>>>>>>> This makes sense: If CSUM_SCTP_VALID is set in csum_flags, the packet is considered
>>>>>>>> to have a correct checksum.
>>>>>>>> For UDP and TCP some drivers calculate the UDP/TCP checksum and set CSUM_DATA_VALID in
>>>>>>>> csum_flags to indicate that the UDP/TCP should consider csum_data to figure out if
>>>>>>>> the packet has a correct checksum. The problem is that CSUM_DATA_VALID is defined as
>>>>>>>> #define CSUM_DATA_VALID         CSUM_L4_VALID
>>>>>>>> In this case the semantic is not that the packet has a valid checksum, but the csum_data
>>>>>>>> field contains information.
>>>>>>>> Now the following happens (on the raspberry pi the driver used is
>>>>>>>> dev/usb/net/if_smsc.c
>>>>>>>> 1. A packet is received and if it is not too short, the checksum computed
>>>>>>>> is stored in csum_data and the flag CSUM_DATA_VALID is set. This happens
>>>>>>>> for all IP packets, not only for UDP and TCP packets.
>>>>>>>> 2. In case of SCTP packets, the SCTP interprets CSUM_DATA_VALID as CSUM_SCTP_VALID
>>>>>>>> and accepts the packet. So no SCTP checksum check ever happened.
>>>>>>>> Alternatives to fix this:
>>>>>>>> 1. Change all drivers to set CSUM_DATA_VALID only in case of UDP or TCP packets, since
>>>>>>>> it only makes sense in these cases.
>>>>>>> Wait, or for SCTP in cad the crc32 (I think it was)  was actually checked but not otherwise.   This is how it should be imho.  It seems like a driver bug.
>>>>>> I went through the list of drivers and you are right, it seems to be a bug
>>>>>> in if_smsc.c. Most of the other drivers check for UDP/TCP, a small set I can't tell.
>>>>> I'm not sure how the controller computes TCP/UDP checksum values.
>>>>> It seems the publicly available data sheet was highly sanitized so
>>>>> it was useless to me.  The comment in the driver says that the
>>>> Same for me...
>>>>> controller computes RX checksum after the IPv4 header to the end of
>>>>> ethernet frame. After seeing that comment, three questions popped
>>>>> up:
>> OK, I did some testing. It looks like the card is just computing the
>> checksum over the IP payload taking the correct IP header length into account.
>>>>> 1. Is the controller smart enough to skip IP options header in
>>>>> TCP/UDP checksum offloading?
>> Yes, I can send fragmented and un-fragmented UDP packets with IP options
>> and they are handled correctly. Even if the last fragment is too short.
> I'm assuming you're taking about receiving fragmented UDP packets
> with RX checksum offloading, right?
>>>>> 2. How controller handles UDP checksum value 0x0000(i.e. sender
>>>>> didn't compute UDP checksum)?
>> This case isn't handled. However, udp_input() looks first for zero checksums
>> and only after that in the csum_flags. So it doesn't result in any problems.
>> Would you prefer not to set CSUM_DATA_VALID in this case?
> At least, it correctly updates UDP stats of netstat(1).
Let me double check that...
>>>>> 3. How the controller can compute TCP checksum of fragmented
>>>>> packets?
>> At least it does it right for UDP...
> Hmm, CSUM_DATA_VALID indicates driver computed RX TCP/UDP checksum
> without pseudo header. As you know, controller can't compute
> TCP/UDP checksum until all its fragmented payload are read from
> wire.  Packets may arrive out of order and may be mixed with other
I'm not sure I understand this... Please note that the pseudo header
is not taken into account. So the card can compute the checksum over
the payload of IP for each fragment. This is stored in the csum_data
field. During reassembly the csum_data fields of the fragments are
combined in
This looks OK to me. I'm not sure why you think the cards needs
to keep state for this. I understand it needs state if it wants
to take the pseudoheader into account, but this is not done here.
> packets. Advanced controllers with enough memory may be able to
> compute TCP/UDP checksums by tracking each connections(e.g LRO) but
> low-end controllers may be not. It seems the controller does not
> even support RX TCP/UDP pseudo header checksum offloading so I
> wonder how this controller can support RX TCP/UDP checksum
> offloading for fragmented TCP/UDP packets.
I'm not sure what I'm missing here... You compute the sum over
the parts and add the sums of the parts. That should work for
the UDP/TCP checksum.
> Some controllers maintain two bits for TCP/UDP checksum offloading
> result in status word.  One bit is used to indicate whether
> controller performed TCP/UDP checksum offloading and the other bit
> is used to indicate whether the computed checksum is correct or
For SCTP we only get a bit that the checksum was computed and is correct...
> not. For UDP checksum value 0x0000 and fragmented TCP/UDP packets,
> these controllers do not attempt to compute TCP/UDP checksum.
I think it "just" computes it and leaves it up to the upper layer
to use the result or not...

Best regards
>> Best regards
>> Michael
>>>>> Since you have the controller I guess it's easy to verify all
>>>>> cases.  For case 3, I believe the controller can't handle
>>>>> fragmented frames so driver should have to explicitly check ip_off
>>>>> field of IPv4 header.  See how gem(4)/sk(4)/hme(4) and fxp(4)
>>>>> handle it.
>>>> Let me check this. Is there a tool to send UDP/TCP with IP level options
>>>> or do I need to write a small test program myself?
>>> I recall I used buggy ipsend of ipfilter package in the past but it
>>> would be more easy to write a simple test program or patch driver
>>> to generate those frames.

More information about the freebsd-net mailing list