icmp packets on em larger than 1472 [SEC=UNCLASSIFIED]

Kevin Oberman oberman at es.net
Thu Nov 11 16:10:58 UTC 2010


> Date: Wed, 10 Nov 2010 23:49:56 -0800 (PST)
> From: Kirill Yelizarov <ykirill at yahoo.com>
> 
> 
> 
> --- On Thu, 11/11/10, Kevin Oberman <oberman at es.net> wrote:
> 
> > From: Kevin Oberman <oberman at es.net>
> > Subject: Re: icmp packets on em larger than 1472 [SEC=UNCLASSIFIED]
> > To: "Wilkinson, Alex" <alex.wilkinson at dsto.defence.gov.au>
> > Cc: freebsd-stable at freebsd.org
> > Date: Thursday, November 11, 2010, 8:26 AM
> > > Date: Thu, 11 Nov 2010 13:01:26 +0800
> > > From: "Wilkinson, Alex" <alex.wilkinson at dsto.defence.gov.au>
> > > Sender: owner-freebsd-stable at freebsd.org
> > > 
> > > 
> > >     0n Wed, Nov 10, 2010 at 04:21:12AM -0800, Kirill Yelizarov wrote:
> > > 
> > >     >All my em cards running 8.1 stable don't reply to icmp echo requests packets larger than 1472 bytes.
> > >     >
> > >     >On stable 7.2 the same hardware works as expected:
> > >     ># ping -s 1500 192.168.64.99
> > >     >PING 192.168.64.99 (192.168.64.99): 1500 data bytes
> > >     >1508 bytes from 192.168.64.99: icmp_seq=0 ttl=63 time=1.249 ms
> > >     >1508 bytes from 192.168.64.99: icmp_seq=1 ttl=63 time=1.158 ms
> > >     >
> > >     >Here is the dump on em interface
> > >     >15:06:31.452043 IP 192.168.66.65 > *****: ICMP echo request, id 28729, seq 5, length 1480
> > >     >15:06:31.452047 IP 192.168.66.65 > ****: icmp
> > >     >15:06:31.452069 IP **** > 192.168.66.65: ICMP echo reply, id 28729, seq 5, length 1480
> > >     >15:06:31.452071 IP *** > 192.168.66.65: icmp
> > >     >
> > >     >Same ping from same source (it's a 8.1 stable with fxp interface) to em card running 8.1 stable
> > >     >#pciconf -lv
> > >     >em0@pci0:3:4:0:    class=0x020000 card=0x10798086 chip=0x10798086 rev=0x03 hdr=0x00
> > >     >    vendor     = 'Intel Corporation'
> > >     >    device     = 'Dual Port Gigabit Ethernet Controller (82546EB)'
> > >     >    class      = network
> > >     >    subclass   = ethernet
> > >     >
> > >     ># ping -s 1472 192.168.64.200
> > >     >PING 192.168.64.200 (192.168.64.200): 1472 data bytes
> > >     >1480 bytes from 192.168.64.200: icmp_seq=0 ttl=63 time=0.848 ms
> > >     >^C
> > >     >
> > >     ># ping -s 1473 192.168.64.200
> > >     >PING 192.168.64.200 (192.168.64.200): 1473 data bytes
> > >     >^C
> > >     >--- 192.168.64.200 ping statistics ---
> > >     >4 packets transmitted, 0 packets received, 100.0% packet loss
> > > 
> > > works fine for me:
> > > 
> > > FreeBSD 8.1-STABLE #0 r213395
> > > 
> > > em0@pci0:0:25:0:    class=0x020000 card=0x3035103c chip=0x10de8086 rev=0x02 hdr=0x00
> > >     vendor     = 'Intel Corporation'
> > >     device     = 'Intel Gigabit network connection (82567LM-3 )'
> > >     class      = network
> > >     subclass   = ethernet
> > > 
> > > #ping -s 1473 host
> > > PING host(192.168.1.1): 1473 data bytes
> > > 1481 bytes from 192.168.1.1: icmp_seq=0 ttl=253 time=31.506 ms
> > > 1481 bytes from 192.168.1.1: icmp_seq=1 ttl=253 time=31.493 ms
> > > 1481 bytes from 192.168.1.1: icmp_seq=2 ttl=253 time=31.550 ms
> > > ^C
> > 
> > The reason the '-s 1500' worked was that the packets were fragmented. If
> > I add the '-D' option, '-s 1473' fails on v7 and v8. Are the v8 systems
> > where you see it failing without the '-D' on the same network segment?
> > If not, it is likely that an intervening device is refusing to fragment
> > the packet. (Some routers deliberately don't fragment ICMP Echo Request
> > packets.)
> 
> If i set -D -s 1473 sender side refuses to ping and that is
> correct. All mentioned above machines are behind the same router and
> switch. Same hardware running v7 is working while v8 is not. And i
> never saw such problems before.  Also correct me if i'm wrong but the
> dump shows that the packet arrived. I'll try driver from head and will
> post here results.

I did a bit more looking at this today and I see that something bogus is
going on and it MAY be the em driver.

I tried 1473 data byte pings without the DF flag. I then captured the
packets on both ends (where the sending system has a bge (Broadcom GE)
and the responding end has an em (Intel) card).

What I saw was the fragmented IP packets all being received by the
system with the em interface and an ICMP Echo Reply being sent back,
again fragmented. I saw the reply on both ends, so both interfaces were
able to fragment an over-sized packet, transmit the two pieces, and
receive the two pieces. The em device could re-assemble them properly,
but the bge device does not seem to re-assemble them correctly or else
has a problem with ICMP packets bigger than the MTU size.
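
For reference, the arithmetic behind the 1472-byte boundary can be
sketched roughly as below. This is only an illustration, and it assumes
a 1500-byte Ethernet MTU, a 20-byte IP header with no options, and an
8-byte ICMP header; the function name fragment_sizes is made up for the
example and is not part of any tool mentioned in this thread.

```python
def fragment_sizes(icmp_data_len, mtu=1500, ip_header=20, icmp_header=8):
    """Return the IP payload size of each fragment for one ICMP echo.

    Assumes an IP header without options. Every fragment except the
    last must carry a multiple of 8 bytes, so the per-fragment limit
    is rounded down to a multiple of 8.
    """
    payload = icmp_data_len + icmp_header      # ICMP header + data
    max_frag = (mtu - ip_header) // 8 * 8      # 1480 for a 1500-byte MTU
    sizes = []
    while payload > 0:
        chunk = min(payload, max_frag)
        sizes.append(chunk)
        payload -= chunk
    return sizes


if __name__ == "__main__":
    # The three payload sizes used with 'ping -s' in this thread.
    for data_len in (1472, 1473, 1500):
        print(data_len, fragment_sizes(data_len))
```

Under those assumptions, '-s 1472' fits in a single 1480-byte IP
payload, '-s 1473' needs a second fragment carrying one byte, and
'-s 1500' splits into 1480 + 28 bytes, which is consistent with the
"length 1480" line followed by a bare "icmp" line in the dump quoted
earlier.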

When I send from the em system, I see the packets and fragments all
arrive in good form at the bge system, but that system never sends out
a reply. Since this is a kernel function, it may be a driver problem,
but I suspect that it is in the IP stack since I am seeing the problem
with a Broadcom card and I see the data all arriving.

I think Jack can probably relax, but some patch to the network stack
seems to have broken at least ICMP processing. And, since the bge system
was updated to 8-Stable on October 20 while the em system was updated
back on August 9, I suspect the flaw was not driver related and was
committed between August 9 and October 20.

I think this needs to go to the network list where the folks who tinker
with that part of the kernel tend to hang out. Sorry for the cross-post.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman at es.net			Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751

