kern/179975: igb(4) fails to do polling(4)

Antoine Beaupre anarcat at koumbit.org
Tue Jun 25 19:50:00 UTC 2013


>Number:         179975
>Category:       kern
>Synopsis:       igb(4) fails to do polling(4)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jun 25 19:50:00 UTC 2013
>Closed-Date:
>Last-Modified:
>Originator:     Antoine Beaupre
>Release:        FreeBSD 9.1-RELEASE-p3 amd64
>Organization:
Koumbit.org
>Environment:
System: FreeBSD rtr1.koumbit.net 9.1-RELEASE-p3 FreeBSD 9.1-RELEASE-p3 #0 r251605: Mon Jun 10 16:17:26 EDT 2013 root at rtr1.koumbit.net:/usr/obj/usr/src/sys/KOUMBIT1 amd64

>Description:

The igb(4) driver does not work properly in polling mode, despite advertising the POLLING capability.

This may be related to the VLAN configuration on this machine.

igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=400bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO>
        capabilities=505fb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,POLLING,VLAN_HWCSUM,TSO4,LRO,VLAN_HWFILTER,VLAN_HWTSO>
        ether 90:e2:ba:39:d3:7c
        inet 199.58.81.2 netmask 0xffffffc0 broadcast 199.58.81.63
        inet6 fe80::92e2:baff:fe39:d37c%igb0 prefixlen 64 scopeid 0x1
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        supported media:
                media autoselect
                media 1000baseT
                media 1000baseT mediaopt full-duplex
                media 100baseTX mediaopt full-duplex
                media 100baseTX
                media 10baseT/UTP mediaopt full-duplex
                media 10baseT/UTP
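
To read output like the above more quickly, here is a minimal sketch of a helper that checks whether POLLING appears under capabilities= (supported) and under options= (currently enabled). The function name polling_state is our own invention for illustration and is not part of ifconfig(8).

```shell
#!/bin/sh
# polling_state (hypothetical helper): read `ifconfig <if>` output on stdin
# and report whether POLLING is listed under capabilities= and options=.
# The "nd6 options=" line is ignored because it does not start with "options=".
polling_state() {
    awk '
        /^[ \t]*options=/      && /[<,]POLLING[,>]/ { enabled = 1 }
        /^[ \t]*capabilities=/ && /[<,]POLLING[,>]/ { supported = 1 }
        END {
            printf "supported=%s enabled=%s\n",
                (supported ? "yes" : "no"), (enabled ? "yes" : "no")
        }
    '
}
```

Usage: ifconfig igb0 | polling_state. Fed the igb0 output quoted above, it prints "supported=yes enabled=no".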

Here are the relevant lines from dmesg:

igb0: <Intel(R) PRO/1000 Network Connection version - 2.3.4> mem 0xf7b00000-0xf7b7ffff,0xf7b8c000-0xf7b8ffff irq 16 at device 0.0 on pci1
igb0: Using MSIX interrupts with 5 vectors
igb0: Ethernet address: 90:e2:ba:39:d3:7c
igb0: Bound queue 0 to cpu 0
igb0: Bound queue 1 to cpu 1
igb0: Bound queue 2 to cpu 2
igb0: Bound queue 3 to cpu 3
igb1: <Intel(R) PRO/1000 Network Connection version - 2.3.4> mem 0xf7a00000-0xf7a7ffff,0xf7b88000-0xf7b8bfff irq 17 at device 0.1 on pci1
igb1: Using MSIX interrupts with 5 vectors
igb1: Ethernet address: 90:e2:ba:39:d3:7d
igb1: Bound queue 0 to cpu 0
igb1: Bound queue 1 to cpu 1
igb1: Bound queue 2 to cpu 2
igb1: Bound queue 3 to cpu 3
igb2: <Intel(R) PRO/1000 Network Connection version - 2.3.4> mem 0xf7980000-0xf79fffff,0xf7b84000-0xf7b87fff irq 18 at device 0.2 on pci1
igb2: Using MSIX interrupts with 5 vectors
igb2: Ethernet address: 90:e2:ba:39:d3:7e
igb2: Bound queue 0 to cpu 0
igb2: Bound queue 1 to cpu 1
igb2: Bound queue 2 to cpu 2
igb2: Bound queue 3 to cpu 3
igb3: <Intel(R) PRO/1000 Network Connection version - 2.3.4> mem 0xf7900000-0xf797ffff,0xf7b80000-0xf7b83fff irq 19 at device 0.3 on pci1
igb3: Using MSIX interrupts with 5 vectors
igb3: Ethernet address: 90:e2:ba:39:d3:7f
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3

Note that we cannot reproduce the problem with the following interface:

em0: <Intel(R) PRO/1000 Network Connection 7.3.2> port 0xe000-0xe01f mem 0xf7d00000-0xf7d1ffff,0xf7d20000-0xf7d23fff irq 18 at device 0.0 on pci3
em0: Using MSIX interrupts with 3 vectors
em0: Ethernet address: 00:25:90:ae:dc:02

>How-To-Repeat:

Here's our kernel configuration:

include GENERIC
ident KOUMBIT0
device          pf
device          pflog
device          pfsync
options         ALTQ
options         ALTQ_CBQ
options         ALTQ_RED
options         ALTQ_RIO
options         ALTQ_HFSC
options         ALTQ_CDNR
options         ALTQ_PRIQ
options   IPSEC        #IP security
device    crypto
options         DEVICE_POLLING
device          carp

In our setup, we are building a new router (rtr1, FreeBSD 9.1) to replace our old router (rtr0, FreeBSD 8.3).

Configure igb0 with an IP address, enable polling, and add some VLANs (rc.conf):

ifconfig_igb0="inet 199.58.81.2 netmask 255.255.255.192 polling"
cloned_interfaces="vlan141 vlan60"
ifconfig_vlan141="inet 199.58.80.2 netmask 255.255.255.128 vlan 141 vlandev igb0"
ifconfig_vlan60="inet 199.58.82.2 netmask 255.255.255.192 vlan 60 vlandev igb0"
ifconfig_vlan60_alias0="inet 199.58.81.254 netmask 255.255.255.192"
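
For testing without a reboot, the rc.conf lines above should correspond roughly to the following manual ifconfig commands. This is a sketch under our own assumptions: the run wrapper, the DRY_RUN variable and the setup_igb0_vlans function are hypothetical names added for illustration, and the command order is our guess at what rc(8) ends up doing.

```shell
#!/bin/sh
# run (hypothetical wrapper): print the command when DRY_RUN=1,
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_igb0_vlans (hypothetical): apply the addresses, polling flag and
# VLANs from the rc.conf fragment by hand.
setup_igb0_vlans() {
    run ifconfig igb0 inet 199.58.81.2 netmask 255.255.255.192 polling
    run ifconfig vlan141 create
    run ifconfig vlan141 inet 199.58.80.2 netmask 255.255.255.128 vlan 141 vlandev igb0
    run ifconfig vlan60 create
    run ifconfig vlan60 inet 199.58.82.2 netmask 255.255.255.192 vlan 60 vlandev igb0
    run ifconfig vlan60 inet 199.58.81.254 netmask 255.255.255.192 alias
}
```

Calling setup_igb0_vlans with DRY_RUN=1 set only prints the commands, which makes it easy to review them before running anything as root.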

Configure a switch with those VLANs. Tag VLANs 141 and 60 on the port, and use an untagged VLAN for the native interface.

Configure another router with a similar configuration.

Expected results:

 * all those IPs are pingable from the other router
 * the other router should also be pingable
 * adding the other router as a gateway should make it pingable
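
The reachability checks above can be scripted from the other router. The following is a minimal sketch, not what we actually ran: check_reachable and the PING_CMD override are hypothetical names introduced here so the loop can be exercised without a network.

```shell
#!/bin/sh
# PING_CMD (hypothetical override): command used to probe an address.
PING_CMD=${PING_CMD:-ping}

# check_reachable (hypothetical helper): send 3 echo requests to each
# address given as an argument, print OK or FAIL per address, and
# return non-zero if any address did not answer.
check_reachable() {
    failed=0
    for addr in "$@"; do
        if $PING_CMD -c 3 "$addr" >/dev/null 2>&1; then
            echo "OK   $addr"
        else
            echo "FAIL $addr"
            failed=1
        fi
    done
    return $failed
}
```

Usage, with the addresses configured above: check_reachable 199.58.81.2 199.58.80.2 199.58.82.2 199.58.81.254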

Actual results:

 * ICMP echo requests leave the interface and are received by the other router, which responds
 * ICMP echo replies are not picked up by the interface until POLLING is disabled

This may be related to http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/155030

>Fix:

No known fix. Workaround:

ifconfig igb0 -polling
ifconfig igb0 polling

It seems that cycling the polling configuration fixes the problem, at least temporarily.
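
The workaround can be wrapped in a small function so it can be reapplied when the interface stops receiving. This is only a sketch: cycle_polling, the IFCONFIG override and the CYCLE_DELAY pause are hypothetical names of our own, and we have not established that a pause between the two commands is needed.

```shell
#!/bin/sh
# IFCONFIG (hypothetical override): lets the function be exercised
# without touching a real interface.
IFCONFIG=${IFCONFIG:-ifconfig}

# cycle_polling (hypothetical helper): turn polling off and back on for
# the given interface (default igb0), as in the workaround above.
cycle_polling() {
    ifname=${1:-igb0}
    $IFCONFIG "$ifname" -polling || return 1
    # Assumed pause between the two commands; set CYCLE_DELAY=0 to skip it.
    sleep "${CYCLE_DELAY:-1}"
    $IFCONFIG "$ifname" polling
}
```

Usage (as root): cycle_polling igb0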

We have yet to put this server in production and would really appreciate some help here. We are a small non-profit ISP, but we are skilled enough to test patches against the kernel tree, and would be ready to offer a bounty ($100?) to see this problem fixed quickly.

Our ETA for production is July 9th.
>Release-Note:
>Audit-Trail:
>Unformatted:

