RELENG_7 em problems (and RELENG_8)
Mike Tancsa
mike at sentex.net
Fri Jul 2 17:39:24 UTC 2010
Hi Jack,
Just a followup to the email below. I have now seen what appears
to be the same problem on RELENG_8, but on a different NIC and with
VLANs, so I am not sure if this is a general em problem, a problem
specific to some em NICs, or a TSO problem in general. The issue
seemed to be triggered when I added a new vlan on this NIC:
em3 at pci0:14:0:0: class=0x020000 card=0x109a15d9
chip=0x109a8086 rev=0x00 hdr=0x00
vendor = 'Intel Corporation'
device = 'Intel PRO/1000 PL Network Adaptor (82573L)'
class = network
subclass = ethernet
cap 01[c8] = powerspec 2 supports D0 D3 current D0
cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
pci14: <ACPI PCI bus> on pcib5
em3: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0x6000-0x601f
mem 0xe8300000-0xe831ffff irq 17 at device 0.0 on pci14
em3: Using MSI interrupt
em3: [FILTER]
em3: Ethernet address: 00:30:48:9f:eb:81
em3: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST>
metric 0 mtu 1500
options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
ether 00:30:48:9f:eb:81
inet 10.255.255.254 netmask 0xfffffffc broadcast 10.255.255.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
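For readers comparing the two ifconfig outputs in this thread, the hex value in options= is a bitmask of interface capabilities. Below is a minimal Python sketch that decodes it. The bit values are limited to the seven flags that actually appear in this thread (they are consistent with both options=2098 and options=219b shown here); the function names are made up for illustration, and other capability bits are deliberately left out.

```python
# Capability bits, restricted to the names that appear in this thread.
# Assumption: values as in FreeBSD's net/if.h; both ifconfig outputs
# quoted in this message decode consistently with them.
IFCAP_BITS = {
    "RXCSUM": 0x0001,
    "TXCSUM": 0x0002,
    "VLAN_MTU": 0x0008,
    "VLAN_HWTAGGING": 0x0010,
    "VLAN_HWCSUM": 0x0080,
    "TSO4": 0x0100,
    "WOL_MAGIC": 0x2000,
}

OFFLOADS = ("RXCSUM", "TXCSUM", "TSO4")


def decode_options(value):
    """Return the known capability names set in an options= bitmask."""
    return [name for name, bit in IFCAP_BITS.items() if value & bit]


def offloads_enabled(value):
    """Return only the checksum/TSO offloads that are still enabled."""
    return [name for name in decode_options(value) if name in OFFLOADS]


if __name__ == "__main__":
    for label, value in (("em3", 0x2098), ("em2", 0x219B)):
        print(label, decode_options(value), offloads_enabled(value))
```

Decoded this way, the em3 output above (0x2098) has none of RXCSUM, TXCSUM or TSO4 set, while the em2 output in the quoted message (0x219b) has all three.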
I had to disable tso, rxcsum and txcsum in order to see the devices on
the other side of the two vlans trunked off em3. Unfortunately, the
other sides were switches 100km and 500km away, so I didn't have any
tcpdump capabilities to diagnose the issue. I had already created
one vlan off this NIC and all was fine. A few weeks later, I added a
new one and I could no longer telnet into the remote switches from
the local machine. But I could telnet into the switches from
other machines, so the problem is on this box. Hence, it would appear
to be a general TSO issue, no? I could always ping the remote devices,
but could not reach any tcp services. Remembering the earlier issue,
I first disabled tso on the NIC (I didn't disable net.inet.tcp.tso,
as I forgot about that). Still nothing. Then I disabled rxcsum and
txcsum, and I could then telnet into the remote devices.
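Since turning off rxcsum and txcsum is what restored connectivity, and the pcap in the quoted message shows a bad checksum on the SYN-ACK, it can be useful to recompute a TCP checksum by hand from captured bytes. The following is a minimal Python sketch of the standard Internet checksum (RFC 1071) over the IPv4 pseudo-header and TCP segment. It is not taken from the em driver; the function names are hypothetical, and it assumes IPv4 and that the raw TCP header plus payload is available as bytes. Keep in mind that a capture taken on the sending host with txcsum enabled can show incorrect checksums simply because the NIC fills them in after the capture point.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus TCP header and payload."""
    pseudo = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        0,                   # reserved zero byte
        socket.IPPROTO_TCP,  # protocol number 6
        len(segment),        # TCP length (header + payload)
    )
    return internet_checksum(pseudo + segment)


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """A segment with a valid checksum field in place sums to zero."""
    return tcp_checksum(src_ip, dst_ip, segment) == 0
```

To compute the value a segment should carry, zero bytes 16 and 17 of the TCP header and call tcp_checksum(); to validate a captured segment as-is, call tcp_checksum_ok().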
This newly observed issue was on a system built (buildworld) on
Mon Jun 14 11:29:12 EDT 2010.
I will try to recreate the issue locally again to see if I can
trigger the problem on demand. Any thoughts on what it might be?
Perhaps an issue specific to certain em NICs?
---Mike
At 04:31 PM 6/10/2010, Mike Tancsa wrote:
>Hi Jack,
> I am seeing some issues on RELENG_7 with a specific em nic
>
>em2 at pci0:13:0:0: class=0x020000 card=0x108c15d9
>chip=0x108c8086 rev=0x03 hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Intel Corporation 82573E Gigabit Ethernet
> Controller (Copper) (82573E)'
> class = network
> subclass = ethernet
> cap 01[c8] = powerspec 2 supports D0 D3 current D0
> cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
> cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>
>If I disable tso, I am not able to make a tcp connection into the host
>
>eg
>0[psbgate1]# ifconfig em2
>em2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>        options=219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC>
> ether 00:30:48:9f:eb:80
> inet 192.168.128.200 netmask 0xfffffff0 broadcast 192.168.128.207
> media: Ethernet autoselect (100baseTX <full-duplex>)
> status: active
>0[psbgate1]# ifconfig em2 -tso
>0[psbgate1]#
>
>
>Looking at the pcap, the checksum is bad on the syn-ack. If I
>re-enable tso, it seems to be ok
>
>16:18:01.113297 IP (tos 0x10, ttl 64, id 6339, offset 0, flags [DF],
>proto TCP (6), length 60) 192.168.128.196.54172 >
>192.168.128.200.22: S, cksum 0x4e79 (correct),
>3313156149:3313156149(0) win 65535 <mss 1460,nop,wscale
>3,sackOK,timestamp 3376174416 0>
>16:18:01.123676 IP (tos 0x0, ttl 64, id 3311, offset 0, flags [DF],
>proto TCP (6), length 60) 192.168.128.200.22 >
>192.168.128.196.54172: S, cksum 0x81c9 (incorrect (-> 0x51f2),
>1373042663:1373042663(0) ack 3313156150 win 65535 <mss
>1460,nop,wscale 3,sackOK,timestamp 1251567646 3376174416>
>
>
>em2: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0x5000-0x501f
>mem 0xe8200000-0xe821ffff irq 16 at device 0.0 on pci13
>em2: Using MSI interrupt
>em2: [FILTER]
>em2: Ethernet address: 00:30:48:9f:eb:80
>pcib5: <ACPI PCI-PCI bridge> irq 16 at device 28.5 on pci0
>pci14: <ACPI PCI bus> on pcib5
>em3: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0x6000-0x601f
>mem 0xe8300000-0xe831ffff irq 17 at device 0.0 on pci14
>em3: Using MSI interrupt
>em3: [FILTER]
>em3: Ethernet address: 00:30:48:9f:eb:81
>
>
>Also there is still the issue with
>
>http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052842.html
>
>in RELENG_7 ?
>
> ---Mike
>
>
>--------------------------------------------------------------------
>Mike Tancsa, tel +1 519 651 3400
>Sentex Communications, mike at sentex.net
>Providing Internet since 1994 www.sentex.net
>Cambridge, Ontario Canada www.sentex.net/mike
>
--------------------------------------------------------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike at sentex.net
Providing Internet since 1994 www.sentex.net
Cambridge, Ontario Canada www.sentex.net/mike