Packet corruption in re0
Pyun YongHyeon
pyunyh at gmail.com
Sun Mar 16 22:12:15 PDT 2008
On Fri, Feb 22, 2008 at 10:43:22AM +0200, Ian FREISLICH wrote:
> Pyun YongHyeon wrote:
> > On Thu, Feb 21, 2008 at 01:18:18PM +0200, Ian FREISLICH wrote:
> > > Pyun YongHyeon wrote:
> > > > On Thu, Feb 21, 2008 at 02:47:43PM +1000, Robert Backhaus wrote:
> > > > > > On Thu, Feb 21, 2008 at 1:50 PM, Pyun YongHyeon <pyunyh at gmail.com> wrote:
> > > > > > On Thu, Feb 21, 2008 at 11:03:02AM +1000, Robert Backhaus wrote:
> > > > > > > > I am experiencing roughly 15% packet corruption on the re interface on
> > > > > > > > my freebsd 7/amd64 box.
> > > > > > >
> > > > > > > > FreeBSD gw.flexi.robbak.com 7.0-PRERELEASE FreeBSD 7.0-PRERELEASE #8:
> > > > > > > Tue Feb 5 09:49:55 EST 2008
> > > > > > > root at gw.flexi.robbak.com:/usr/obj/usr/src/sys/GW amd64
> > > > > > >
> > > > > > > > Just to make troubleshooting difficult, this problem only shows up
> > > > > > > > after the system has been up for roughly 36 hours, depending on the
> > > > > > > > amount of traffic.
> > > > > > >
> > > > > >
> > > > > > > I didn't look at the attached tcpdump files, but I guess the
> > > > > > > instability issue was fixed in HEAD. It's not yet MFCed but
> > > > > > > I'll handle it in a week.
> > > > > >
> > > > > > Would you try re(4) in HEAD?
> > > > > >
> > > > >
> > > > > > OK, I'll do that. What is the best way to do that? csupping to "." seems a
> > > > > > bit drastic, and I don't do much with cvs proper. I take it that I should use
> > > > > > anon-cvs to grab the directory, but I don't quite know how.
> > > > >
> > > >
> > > > > Copy sys/dev/re/if_re.c and sys/pci/if_rlreg.h in HEAD to your box.
> > > > > Due to the lack of m_defrag(9) in 7-PRERELEASE/RC, you also have to add
> > > > > that function to if_re.c (copy m_defrag() in sys/kern/uipc_mbuf.c on
> > > > > HEAD/RELENG_7 to if_re.c). That would make it build on your box.
> > >
> > > This doesn't solve the problem that I'm seeing on re(4) interfaces.
> > > It basically shows up as quagga establishing OSPF neighbours as
> > > "Exchange/DR" when VLAN hardware tagging is enabled. I'm running
> > > OSPF over 802.1Q vlans. Neighbours are correctly negotiated once
> > > VLAN hardware tagging is disabled on the interface.
> > >
> > > I'll do more debugging.
> > >
> >
> > Hmm. That sounds like a different issue to me. I guess I didn't change
> > any semantics in VLAN H/W tagging. Do you still see the same VLAN H/W
> > tagging related issues on RELENG_7?
> >
> > To narrow down the issue it would be even better to know which part
> > of the H/W assistance is broken. For example,
> > - Disable checksum offload for VLAN interface first and check
> > whether quagga works.
>
> You can only disable offload on the parent interface.
>
> > - Disable checksum offload for parent interface and check again.
> > If you can post tcpdump output for the broken connection it may help a
> > lot to diagnose the issue.
>
> The only flag affecting this behaviour is vlanhwtag. Various
> permutations of the interface flags make no difference to this
> behaviour as long as hardware tagging is enabled.
>
> It seems like it's corrupting large packets on transmit when vlanhwtag
> is enabled. From the tcpdump output it looks like a padding or
> packet length issue.
>
> Here's what tcpdump on the re(4) device thinks it's transmitting:
>
> 00:08:a1:3c:32:9c > 00:90:fb:0c:89:7d, ethertype 802.1Q (0x8100), length 1510: vlan 1000, p 0, ethertype IPv4, 196.22.138.92 > 196.22.138.89: OSPFv2, Database Description, length: 1472
>
> Here's what was actually received by the em(4) device on the
> neighbour. Note the absence of the 802.1Q header:
>
> 00:08:a1:3c:32:9c > 00:90:fb:0c:89:7d, ethertype IPv4 (0x0800), length 1506: 196.22.138.92 > 196.22.138.89: OSPFv2, Database Description, length: 1472
>
> When vlanhwtagging is disabled, the re(4) device transmits:
>
> 00:90:fb:0c:89:7d > 00:08:a1:3c:32:9c, ethertype 802.1Q (0x8100), length 1510: vlan 1000, p 0, ethertype IPv4, 196.22.138.89 > 196.22.138.92: OSPFv2, Database Description, length: 1472
>
> and the em(4) device receives:
>
> 00:08:a1:3c:32:9c > 00:90:fb:0c:89:7d, ethertype 802.1Q (0x8100), length 1510: vlan 1000, p 0, ethertype IPv4, 196.22.138.92 > 196.22.138.89: OSPFv2, Database Description, length: 1472
>
> Let me know if you need more detailed tcpdump output than I've provided.
>
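The 4-byte difference between the transmitted and received frames above (1510 vs. 1506) is exactly the size of one 802.1Q tag. As a rough sketch only, and not code from re(4) or tcpdump, a captured frame could be checked for the tag like this. The helper names parse_ethernet() and strip_vlan_tag() are made up for illustration, and the zero-filled payload is a placeholder, not the real OSPF packet:

```python
import struct

ETHERTYPE_VLAN = 0x8100  # 802.1Q tag protocol identifier (TPID)
ETHERTYPE_IPV4 = 0x0800


def parse_ethernet(frame: bytes) -> dict:
    """Report whether a raw Ethernet frame carries an 802.1Q tag."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_VLAN:
        return {"tagged": False, "ethertype": ethertype, "length": len(frame)}
    if len(frame) < 18:
        raise ValueError("truncated 802.1Q tag")
    # The tag is TPID (2 bytes) + TCI (2 bytes); the real ethertype follows.
    tci, inner = struct.unpack("!HH", frame[14:18])
    return {
        "tagged": True,
        "priority": tci >> 13,   # 3-bit priority
        "vlan": tci & 0x0FFF,    # 12-bit VLAN ID
        "ethertype": inner,
        "length": len(frame),
    }


def strip_vlan_tag(frame: bytes) -> bytes:
    """Drop the 4-byte 802.1Q tag, which is what the receiver appears to see."""
    return frame[:12] + frame[16:]


# Hypothetical frame using the addresses and VLAN from the traces above.
dst = bytes.fromhex("0090fb0c897d")
src = bytes.fromhex("0008a13c329c")
payload = bytes(1492)  # placeholder so the tagged frame is 1510 bytes long
tagged = dst + src + struct.pack("!HHH", ETHERTYPE_VLAN, 1000, ETHERTYPE_IPV4) + payload
untagged = strip_vlan_tag(tagged)

print(parse_ethernet(tagged))
print(parse_ethernet(untagged))
```

Running parse_ethernet() on frames captured at both ends would show whether the only change in flight is the missing tag or whether other bytes differ too.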
I guess I've found a VLAN hardware tagging bug in re(4).
Please try this one and let me know the result.
http://people.freebsd.org/~yongari/re/if_re.c
http://people.freebsd.org/~yongari/re/if_rlreg.h
> Ian
>
> --
> Ian Freislich
>
--
Regards,
Pyun YongHyeon