XL driver checksum producing corrupted but checksum-correct packets

Matthew Dillon dillon at apollo.backplane.com
Sun Jan 25 11:36:37 PST 2004


:> > To pick up the corrupted packet on the machine where the corruption is
:> > occurring, you might want to try hooking up the UDP checksum drop case to
:> > BPF_MTAP() for a special BPF device or rule, or have it spit them into a
:> > raw socket (probably easier).
:> 
:> He said that the packet's checksum passes, but it is corrupt, so this
:> won't work. 
:
:I may have misread: my reading was that the if_xl card marks the packet as
:having passed the checksum test, but if you let the OS do the checksum,
:the checksum fails.  I.e., either the hardware checksumming is broken, or
:the data is corrupted between when the hardware does the checksum, and it
:reaches the OS buffer.  As such, Sam's patch works because it tells the OS
:to ignore the checksum results from the hardware (although it doesn't
:disable the checking of checksums), causing the OS to recalculate the
:checksums and drop the packets rather than accepting them.  The goal of
:the change I suggested would be to also do the checksums in the OS as
:well, which allows you to detect the bad packets, but instead of dropping
:them, funnel them aside for later analysis.   However, if I've misread,
:sorry for the confusion!
:
:Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
    
    Ok, there's something not right... at least for me, the problem is
    on the transmit side.  That is, it's my NFS client that has the XL PCI
    card in it, and it's a packet that the client is transmitting that is
    getting corrupted.  My NFS server is receiving the corrupted packet and
    accepting it (that is, the checksum check on my server on reception is
    succeeding).  My server does *NOT* have an XL card in it.

    xl0: <3Com 3cSOHO100-TX OfficeConnect> port 0x9000-0x907f mem 0xe1000000-0xe100007f irq 11 at device 6.0 on pci1

    When I turn off transmit checksums on the client side, the problem does
    not occur.  However, I do not know whether that is because the server is
    now rejecting the packet as having a bad checksum due to the packet
    data being corrupted by the XL card as it is being sent, or whether it
    is because the client is no longer sending a corrupt packet.
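
    To illustrate why those two cases look the same from the client, here
    is a minimal sketch using an RFC 1071 style Internet checksum.  It is
    not if_xl code: struct pkt and all of the helper names are made up for
    illustration.  If the data is damaged before the checksum is computed
    (as could happen if the card checksums data that was already mangled
    on its way to the card), the receiver accepts the packet.  If the host
    computes the checksum first and the data is damaged afterwards, the
    receiver rejects it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_MAX 64

/* Hypothetical packet model: a payload plus a 16-bit checksum field. */
struct pkt {
    uint8_t  data[PAYLOAD_MAX];
    size_t   len;
    uint16_t cksum;
};

/* Ones' complement sum of 16-bit big-endian words, folded to 16 bits. */
static uint16_t ocsum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (i < len)                        /* odd length: pad with a zero byte */
        sum += (uint32_t)p[i] << 8;
    while (sum >> 16)                   /* fold carries back into low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Sender side: store the inverted sum in the checksum field. */
static void pkt_seal(struct pkt *p)
{
    p->cksum = (uint16_t)~ocsum(p->data, p->len);
}

/* Receiver side: data sum plus checksum field must fold to 0xffff. */
static int pkt_ok(const struct pkt *p)
{
    uint32_t s = (uint32_t)ocsum(p->data, p->len) + p->cksum;

    s = (s & 0xffff) + (s >> 16);
    return s == 0xffff;
}

static void pkt_fill(struct pkt *p, const char *s)
{
    size_t n = strlen(s);

    assert(n <= PAYLOAD_MAX);
    memcpy(p->data, s, n);
    p->len = n;
    p->cksum = 0;
}

/* Case 1: data is damaged first, then the checksum is computed over it. */
static int corrupt_then_checksum(void)
{
    struct pkt p;

    pkt_fill(&p, "NFS write payload");
    p.data[3] ^= 0x10;                  /* single-bit corruption */
    pkt_seal(&p);                       /* checksum covers the bad data */
    return pkt_ok(&p);                  /* receiver's verdict */
}

/* Case 2: checksum is computed over good data, then the data is damaged. */
static int checksum_then_corrupt(void)
{
    struct pkt p;

    pkt_fill(&p, "NFS write payload");
    pkt_seal(&p);                       /* checksum covers the good data */
    p.data[3] ^= 0x10;                  /* same single-bit corruption */
    return pkt_ok(&p);                  /* receiver's verdict */
}
```

    Here corrupt_then_checksum() returns 1 (the receiver accepts the bad
    data) and checksum_then_corrupt() returns 0 (the receiver drops it).
    The first case matches what I see with transmit checksums turned on.
    With them turned off, the second case and "no corruption at all" would
    both make the problem go away as seen from the client.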

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>