packet loss on ixgbe using vlans and routing. Was: packet loss on
ixgbe using vlans and ipv6
John Hay
jhay at meraka.org.za
Wed Jul 21 12:15:49 UTC 2010
Ok, after some more testing, I found that it was not only with ipv6 that
I had packet loss. Routing either ipv4 or ipv6 had some loss.
My test setup is the Dell T710 with its ix2 connected to a 10G port of
a Nortel 4526GTX. On that port I have 2 vlans configured with half of
the 1G ports in the one vlan and the other half in the other vlan.
If I test with iperf from one of the machines on a 1G port to the T710,
I get 920Mbit/s. If I do it simultaneously from a few machines connected
to the 1G ports, all of them basically saturate their 1G links.
If I now try to route from the one vlan to the other, i.e. doing an iperf
from a 1G connected machine, through the T710, to another 1G connected
machine, I see packet loss; sometimes I get only 100kbit/s.
So it seems that as long as the T710 with the 10G card is the start or
end point of the connection, I get no packet loss, but as soon as it
has to route, something goes wrong.
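
Since the loss varies so much from run to run, it may help to tabulate it over several runs rather than compare single numbers. What follows is only a minimal sketch in Python, assuming the usual summary line printed by ping/ping6 ("N packets transmitted, M packets received, X% packet loss"). The function names and the sample lines are made up for illustration; they are not measurements from the T710.

```python
import re

# Matches the counts in a ping/ping6 summary line, e.g.
# "1000 packets transmitted, 970 packets received, 3.0% packet loss"
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) packets received")


def loss_percent(summary_line):
    """Return packet loss in percent, computed from the sent/received counts."""
    match = SUMMARY_RE.search(summary_line)
    if match is None:
        raise ValueError("not a ping summary line: %r" % summary_line)
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets transmitted")
    return 100.0 * (sent - received) / sent


def summarize(summary_lines):
    """Return (min, max, mean) loss in percent over several runs."""
    losses = [loss_percent(line) for line in summary_lines]
    return min(losses), max(losses), sum(losses) / len(losses)


# Hypothetical sample runs (illustrative values only, not real results).
SAMPLE_RUNS = [
    "1000 packets transmitted, 990 packets received, 1.0% packet loss",
    "1000 packets transmitted, 700 packets received, 30.0% packet loss",
]

if __name__ == "__main__":
    low, high, mean = summarize(SAMPLE_RUNS)
    print("loss min/max/mean: %.1f%% / %.1f%% / %.1f%%" % (low, high, mean))
```

Feeding it the summary lines from a series of runs with and without the vlans configured would give comparable min/max/mean figures for the two cases.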
John
On Tue, Jul 20, 2010 at 06:20:39AM +0200, John Hay wrote:
> On Mon, Jul 19, 2010 at 01:46:18PM -0700, Jeremy Chadwick wrote:
> > On Mon, Jul 19, 2010 at 10:25:42PM +0200, John Hay wrote:
> > > I have a Dell T710 with 4 X 10G ethernet interfaces (2 X Dual port Intel
> > > 82599 cards). It is running FreeBSD RELENG_8 last updated on July 13.
> > >
> > > What I see is packet loss (0 - 40%) on IPv6 packets in vlans, when the
> > > machine is not the originator of the packets.
> > >
> > > Let me try to describe a little more. If a neighbouring machine pings it
> > > with ping6, there will be packet loss. If it acts as a router for ipv6,
> > > there will be packet loss. This happens even when the network is pretty
> > > idle and with different switches (Nortel and Cisco equipment). The packet
> > > loss fluctuates a lot. Pinging 1000 packets might lose 1% one time and
> > > 30% the next. Looking with tcpdump, I can see the packets arriving and
> > > going out, but they never arrive at the next machine. (My feeling is
> > > that they get lost inside the card.) The error counters on the switch
> > > do not increment.
> > >
> > > I do not see packet loss if the machine originates the packets, for example
> > > ping6 from the machine. Also ipv4 packets do not have any packet loss. If
> > > I do not use vlans, I don't see packet loss with ipv6 either.
> > >
> > > pciconf -l of the ethernet cards:
> > >
> > > ix0 at pci0:129:0:0: class=0x020000 card=0x00038086 chip=0x10fb8086 rev=0x01 hdr=0x00
> > > ix1 at pci0:129:0:1: class=0x020000 card=0x00038086 chip=0x10fb8086 rev=0x01 hdr=0x00
> > > ix2 at pci0:131:0:0: class=0x020000 card=0x00038086 chip=0x10fb8086 rev=0x01 hdr=0x00
> > > ix3 at pci0:131:0:1: class=0x020000 card=0x00038086 chip=0x10fb8086 rev=0x01 hdr=0x00
> >
> > Can you provide pciconf -lvc output for the ix[0-3] cards instead? I
> > believe Jack Vogel will need this. vmstat -i might also be helpful
> > (full output).
>
> Ok, here it is, with a netstat -m thrown in as well. The numbers are pretty low
> because I rebooted after compiling a kernel with IPFIREWALL, ROUTETABLES,
> MROUTING and FLOWTABLE removed. I'll add my kernel config file with empty
> and commented out lines removed.
>
> After rebooting, I first tested with vlans (that is in my rc.conf) and then
> tested with the vlans unconfigured on ix2.
>
> ix0 at pci0:129:0:0: class=0x020000 card=0x00038086 chip=0x10fb8086 rev=0x01 hdr=0x00
> vendor = 'Intel Corporation'
> class = network
> subclass = ethernet
> cap 01[40] = powerspec 3 supports D0 D3 current D0
> cap 05[50] = MSI supports 1 message, 64 bit, vector masks
> cap 11[70] = MSI-X supports 64 messages in map 0x20 enabled
> cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x8(x8)
> ix1 at pci0:129:0:1: class=0x020000 card=0x00038086 chip=0x10fb8086 rev=0x01 hdr=0x00
> vendor = 'Intel Corporation'
> class = network
> subclass = ethernet
> cap 01[40] = powerspec 3 supports D0 D3 current D0
> cap 05[50] = MSI supports 1 message, 64 bit, vector masks
> cap 11[70] = MSI-X supports 64 messages in map 0x20 enabled
> cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x8(x8)
> ix2 at pci0:131:0:0: class=0x020000 card=0x00038086 chip=0x10fb8086 rev=0x01 hdr=0x00
> vendor = 'Intel Corporation'
> class = network
> subclass = ethernet
> cap 01[40] = powerspec 3 supports D0 D3 current D0
> cap 05[50] = MSI supports 1 message, 64 bit, vector masks
> cap 11[70] = MSI-X supports 64 messages in map 0x20 enabled
> cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x8(x8)
> ix3 at pci0:131:0:1: class=0x020000 card=0x00038086 chip=0x10fb8086 rev=0x01 hdr=0x00
> vendor = 'Intel Corporation'
> class = network
> subclass = ethernet
> cap 01[40] = powerspec 3 supports D0 D3 current D0
> cap 05[50] = MSI supports 1 message, 64 bit, vector masks
> cap 11[70] = MSI-X supports 64 messages in map 0x20 enabled
> cap 10[a0] = PCI-Express 2 endpoint max data 256(512) link x8(x8)
>
> output of vmstat -i
>
> interrupt total rate
> irq19: ehci0 28371 0
> irq21: uhci2 uhci4+ 48 0
> irq23: atapci0 46 0
> irq34: mpt0 146954 2
> cpu0: timer 112205297 1999
> irq256: bce0 52063 0
> irq257: bce1 1 0
> irq258: bce2 1 0
> irq259: bce3 1 0
> irq260: ix0:que 0 142258 2
> irq261: ix0:que 1 56464 1
> irq262: ix0:que 2 56199 1
> irq263: ix0:que 3 56198 1
> irq264: ix0:que 4 66569 1
> irq265: ix0:que 5 56148 1
> irq266: ix0:que 6 56217 1
> irq267: ix0:que 7 56311 1
> irq268: ix0:que 8 56169 1
> irq269: ix0:que 9 69485 1
> irq270: ix0:que 10 56176 1
> irq271: ix0:que 11 56205 1
> irq272: ix0:que 12 56281 1
> irq273: ix0:que 13 56359 1
> irq274: ix0:que 14 56292 1
> irq275: ix0:que 15 56197 1
> irq276: ix0:link 2 0
> irq277: ix1:que 0 107873 1
> irq278: ix1:que 1 56094 0
> irq279: ix1:que 2 56097 0
> irq280: ix1:que 3 56096 0
> irq281: ix1:que 4 65439 1
> irq282: ix1:que 5 56091 0
> irq283: ix1:que 6 56092 0
> irq284: ix1:que 7 56098 0
> irq285: ix1:que 8 56091 0
> irq286: ix1:que 9 56096 0
> irq287: ix1:que 10 56093 0
> irq288: ix1:que 11 56091 0
> irq289: ix1:que 12 56096 0
> irq290: ix1:que 13 56095 0
> irq291: ix1:que 14 57125 1
> irq292: ix1:que 15 56093 0
> irq293: ix1:link 1 0
> irq294: ix2:que 0 231250 4
> irq295: ix2:que 1 57784 1
> irq296: ix2:que 2 69956 1
> irq297: ix2:que 3 59498 1
> irq298: ix2:que 4 58201 1
> irq299: ix2:que 5 58599 1
> irq300: ix2:que 6 57813 1
> irq301: ix2:que 7 60075 1
> irq302: ix2:que 8 68639 1
> irq303: ix2:que 9 58194 1
> irq304: ix2:que 10 60752 1
> irq305: ix2:que 11 57628 1
> irq306: ix2:que 12 66796 1
> irq307: ix2:que 13 63307 1
> irq308: ix2:que 14 60788 1
> irq309: ix2:que 15 59102 1
> irq310: ix2:link 5 0
> irq311: ix3:que 0 56090 0
> irq312: ix3:que 1 56090 0
> irq313: ix3:que 2 56090 0
> irq314: ix3:que 3 56090 0
> irq315: ix3:que 4 56090 0
> irq316: ix3:que 5 56090 0
> irq317: ix3:que 6 56090 0
> irq318: ix3:que 7 56090 0
> irq319: ix3:que 8 56090 0
> irq320: ix3:que 9 56090 0
> irq321: ix3:que 10 56090 0
> irq322: ix3:que 11 56090 0
> irq323: ix3:que 12 56090 0
> irq324: ix3:que 13 56090 0
> irq325: ix3:que 14 56090 0
> irq326: ix3:que 15 56090 0
> cpu1: timer 112196134 1999
> cpu10: timer 112196179 1999
> cpu3: timer 112196135 1999
> cpu8: timer 112196108 1999
> cpu4: timer 112196161 1999
> cpu11: timer 112196179 1999
> cpu5: timer 112196161 1999
> cpu13: timer 112196179 1999
> cpu6: timer 112196161 1999
> cpu14: timer 112196179 1999
> cpu2: timer 112196106 1999
> cpu12: timer 112196179 1999
> cpu7: timer 112196161 1999
> cpu9: timer 112196155 1999
> cpu15: timer 112196179 1999
> Total 1799390156 32072
>
> netstat -m
>
> 133178/4042/137220 mbufs in use (current/cache/total)
> 133112/2062/135174/262144 mbuf clusters in use (current/cache/total/max)
> 133112/2056 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/20/20/131072 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/65536 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/32768 16k jumbo clusters in use (current/cache/total/max)
> 299518K/5214K/304733K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/0/0 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
>
> kernel config file, basically started with 64 bit and removed the stuff
> I do not need.
>
> cpu HAMMER
> ident SEEKAT
> device ipmi
> makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
> options SCHED_ULE # ULE scheduler
> options PREEMPTION # Enable kernel thread preemption
> options INET # InterNETworking
> options INET6 # IPv6 communications protocols
> options SCTP # Stream Control Transmission Protocol
> options FFS # Berkeley Fast Filesystem
> options SOFTUPDATES # Enable FFS soft updates support
> options UFS_DIRHASH # Improve performance on big directories
> options CD9660 # ISO 9660 Filesystem
> options PROCFS # Process filesystem (requires PSEUDOFS)
> options PSEUDOFS # Pseudo-filesystem framework
> options GEOM_PART_GPT # GUID Partition Tables.
> options GEOM_LABEL # Provides labelization
> options COMPAT_43TTY # BSD 4.3 TTY compat (sgtty)
> options COMPAT_IA32 # Compatible with i386 binaries
> options COMPAT_FREEBSD4 # Compatible with FreeBSD4
> options COMPAT_FREEBSD5 # Compatible with FreeBSD5
> options COMPAT_FREEBSD6 # Compatible with FreeBSD6
> options COMPAT_FREEBSD7 # Compatible with FreeBSD7
> options KTRACE # ktrace(1) support
> options STACK # stack(9) support
> options SYSVSHM # SYSV-style shared memory
> options SYSVMSG # SYSV-style message queues
> options SYSVSEM # SYSV-style semaphores
> options P1003_1B_SEMAPHORES # POSIX-style semaphores
> options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
> options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed.
> options KBD_INSTALL_CDEV # install a CDEV entry in /dev
> options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4)
> options INCLUDE_CONFIG_FILE # Include this file in kernel
> options SMP # Symmetric MultiProcessor Kernel
> device cpufreq
> device acpi
> device pci
> device ata
> device atapicd # ATAPI CDROM drives
> device mpt # LSI-Logic MPT-Fusion
> device scbus # SCSI bus (required for SCSI)
> device da # Direct Access (disks)
> device pass # Passthrough device (direct SCSI access)
> device atkbdc # AT keyboard controller
> device atkbd # AT keyboard
> device psm # PS/2 mouse
> device kbdmux # keyboard multiplexer
> device vga # VGA video card driver
> device splash # Splash screen and screen saver support
> device sc
> device agp # support several AGP chipsets
> device uart # Generic UART driver
> device loop # Network loopback
> device random # Entropy device
> device ether # Ethernet support
> device pty # BSD-style compatibility pseudo ttys
> device bpf # Berkeley packet filter
> device uhci # UHCI PCI->USB interface
> device ehci # EHCI PCI->USB interface (USB 2.0)
> device usb # USB Bus (required)
> device uhid # "Human Interface Devices"
> device ukbd # Keyboard
> device umass # Disks/Mass storage - Requires scbus and da
> device ums # Mouse
>
> kldstat
> Id Refs Address            Size     Name
>  1   55 0xffffffff80100000 6ea290   kernel
>  2    1 0xffffffff807eb000 19e088   zfs.ko
>  3    2 0xffffffff8098a000 3860     opensolaris.ko
>  4    2 0xffffffff8098e000 20448    krpc.ko
>  5    1 0xffffffff809af000 21100    geom_mirror.ko
>  6    1 0xffffffff809d1000 66c0     if_vlan.ko
>  7    1 0xffffffff809d8000 506c8    if_bce.ko
>  8    2 0xffffffff80a29000 3ec20    miibus.ko
>  9    1 0xffffffff80a68000 243e0    if_ixgbe.ko
> 10    1 0xffffffff80a8d000 1e08     coretemp.ko
>
> John
> --
> John Hay -- jhay at meraka.csir.co.za / jhay at FreeBSD.org
--
John Hay -- jhay at meraka.csir.co.za / jhay at FreeBSD.org