Network troubles after 8.3 -> 8.4 upgrade

Rick Macklem rmacklem at uoguelph.ca
Thu Apr 17 23:48:06 UTC 2014


John Nielsen wrote:
> On Apr 17, 2014, at 2:38 PM, Andrea Venturoli <ml at netfence.it> wrote:
> 
> > Three days ago I upgraded an amd64 8.3 box to the latest 8.4.
> > Since then the outside network is misbehaving: large mails are not
> > sent (although small ones are), svn operations will work for a
> > while, then come to a sudden stop, etc...
> > Perhaps the most evident test is "wget"ting a big file: it will
> > download some chunk, halt; restart after a while and download
> > another chunk; lose the connection once again, then restart and so
> > on.
> > 
> > I remember a couple of similar experiences in the past, from which
> > I got out by disabling TSO; however those boxes had fxp cards, while
> > this has an em.
> > In any case disabling TSO did not help.
> 
> My first thought was TSO as well, since I've seen the symptoms you
> describe a few times on systems running 10.0. Do you use IPFW or any
> kind of NAT on this system? When an application encounters a network
> problem, does it report or log anything at all? Anything in the
> kernel log/dmesg?
> 
> A bit of a shot in the dark, but could you try applying r264517
> (fixes a problem with VLAN and TSO interaction)?
> http://svnweb.freebsd.org/base/head/sys/net/if_vlan.c?r1=257241&r2=264517
> 
Since the only net driver that sets if_hw_tsomax is Xen's netfront, the
patch currently affects only systems that use that driver. (The bug, which
was also in if_lagg.c, was found during testing of an experimental patch
for a net driver.)

So, I'm pretty sure that patch won't help, rick
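
For anyone who wants to check this against their own tree, a rough sketch
follows. It assumes the kernel sources are under /usr/src/sys; the helper
name list_tsomax_users is made up for illustration. It lists files that
mention if_hw_tsomax at all, so a hit is only a starting point for reading
the code, not proof that a driver sets the field.

```shell
#!/bin/sh
# Hypothetical helper: list source files under a directory that mention
# if_hw_tsomax. The default path /usr/src/sys is an assumption.
list_tsomax_users() {
    dir="${1:-/usr/src/sys}"
    # -r: recurse into subdirectories, -l: print only matching file names.
    grep -rl 'if_hw_tsomax' "$dir" 2>/dev/null | sort
}
```

Calling it as `list_tsomax_users /usr/src/sys/dev` narrows the search to
the device drivers.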

> Otherwise my only other thought would be the driver. Can you try
> reverting only the em(4) driver back to 8.3? If that helps it would
> give you both a workaround and a clue for where to look for a
> solution. Build modules and a kernel without em(4) from unmodified
> 8.4 src, load em(4) as a module, confirm that the problem persists.
> Replace the contents of src/sys/dev/e1000, src/sys/modules/em and
> src/sys/conf/files with those from an 8.3 src tree (or otherwise
> revert revision 247430), rebuild em module, unload/reload or reboot,
> see if problem goes away. (Could be somewhat complicated by the fact
> that you also have igb interfaces which also use code from the e1000
> directory, but rather than speculate I'll leave solving that as an
> exercise for someone else.)
> 
> JN
> 
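
The rebuild-and-reload part of the procedure above could look roughly like
the sketch below. It assumes the sources are in /usr/src, that the running
kernel was built without the em device, and that the module is named
if_em; the function names are made up for illustration. By default it only
prints the commands. Unloading the module takes em0 and the vlans on top
of it down, so a real run is best done from the console.

```shell
#!/bin/sh
# Sketch of the em(4) module rebuild/reload cycle.
# Assumptions: sources in /usr/src, kernel built without "device em",
# module name if_em. DRY_RUN=1 (the default here) only prints commands.
DRY_RUN="${DRY_RUN:-1}"
SRC="${SRC:-/usr/src}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reload_em_module() {
    # Rebuild and install only the em(4) module.
    run cd "$SRC/sys/modules/em" || return 1
    run make clean all install || return 1
    # Swap the running module for the freshly built one.
    run kldunload if_em
    run kldload if_em
}
```

Run `reload_em_module` once with the unmodified 8.4 sources to confirm the
problem persists, then again after putting the 8.3 driver sources in
place.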
> > This is the relevant part of rc.conf:
> >> cloned_interfaces="lagg0 vlan1 vlan2 vlan3 carp0 carp1 carp3 carp4
> >> carp6 carp7 carp9 carp10"
> >> ifconfig_igb0="up"
> >> ifconfig_igb1="up"
> >> ifconfig_lagg0="laggproto lacp laggport igb0 laggport igb1
> >> 192.168.101.4 netmask 255.255.255.0"
> >> ifconfig_lagg0_alias0="inet 192.168.101.101 netmask 0xffffffff"
> >> ifconfig_carp0="vhid 1 advskew 100 pass xxxxxxx 192.168.101.10"
> >> ifconfig_carp1="vhid 2 pass xxxxxxxx 192.168.101.10"
> >> ifconfig_em0="up"
> >> ifconfig_vlan1="inet 81.174.30.11 netmask 255.255.255.248 vlan 4
> >> vlandev em0"
> >> ifconfig_vlan2="inet 83.211.188.186 netmask 255.255.255.248 vlan 2
> >> vlandev em0"
> >> ifconfig_vlan3="inet 192.168.2.202 netmask 255.255.255.0 vlan 3
> >> vlandev em0"
> >> ifconfig_carp3="vhid 4 advskew 100 pass xxxx 81.174.30.12"
> >> ifconfig_carp4="vhid 5 pass xxxxxxx 81.174.30.12"
> >> ifconfig_carp6="vhid 7 advskew 100 pass xxxxxx 83.211.188.187"
> >> ifconfig_carp7="vhid 8 pass xxxxxxxxxxx 83.211.188.187"
> >> ifconfig_carp9="vhid 10 advskew 100 pass xxxxxxxx 192.168.2.203"
> >> ifconfig_carp10="vhid 11 pass xxxxxxxx 192.168.2.203"
> >> ifconfig_lo0_alias0="inet 127.0.0.2 netmask 0xffffffff"
> >> ifconfig_lo0_alias1="inet 127.0.0.3 netmask 0xffffffff"
> >> ifconfig_lo0_alias2="inet 127.0.0.4 netmask 0xffffffff"
> > 
> > As you can see the setup is quite complicated, but worked like a
> > charm until the upgrade; actually the internal net (igb+lagg+carp)
> > still does, so this is what points me toward em0, where I cannot
> > seem to get any kind of stability.
> > 
> > The card is
> >> em0 at pci0:6:0:0: class=0x020000 card=0x10828086 chip=0x107d8086
> >> rev=0x06 hdr=0x00
> >>    vendor     = 'Intel Corporation'
> >>    device     = 'PRO/1000 PT'
> >>    class      = network
> >>    subclass   = ethernet
> > 
> > I tried disabling TSO, RXCSUM, TXCSUM, VLANHWTAG, VLANHWCSUM,
> > VLANHWTSO...
> > I tried putting the card into 10baseT/UTP <half-duplex> mode...
> > I tried sysctl net.inet.tcp.tso=0...
> > 
> > None helped.
> > 
> > Maybe I'm barking up the wrong tree, but nothing is in the logs to
> > help...
> > 
> > Nor did Google or wading through bug reports.
> > 
> > 
> > 
> > Now I could restore the dumps I made before upgrading to 8.4 (but
> > I'd really like to avoid this), try to upgrade even further to 9.2
> > (although this will be a lot of work and I'm not looking forward
> > to it as a shot in the dark), drop in another NIC...
> > What I'd really like, however, is some insight.
> > 
> > Is this a known problem of some sort? Is this card or this driver
> > known to be broken?
> > Is there any way I could get some debugging info?
> > 
> > Any hint is appreciated (and I need it badly :( !!!).
> > 
> > bye & Thanks
> > 	av.
> > _______________________________________________
> > freebsd-net at freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to
> > "freebsd-net-unsubscribe at freebsd.org"
> > 
> 
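
For reference, the offload settings Andrea reports turning off map onto
ifconfig(8) flags roughly as in the sketch below. The interface name em0
comes from the report; the function name is made up for illustration, and
it is an assumption that the ifconfig on a given release accepts all six
flags. By default the function only prints what it would run.

```shell
#!/bin/sh
# Sketch: turn off the offload features listed in the report on one
# interface. DRY_RUN=1 (the default here) only prints the commands.
disable_offloads() {
    iface="${1:-em0}"
    cmd="ifconfig $iface"
    # A leading "-" on a capability name disables it.
    for cap in tso rxcsum txcsum vlanhwtag vlanhwcsum vlanhwtso; do
        cmd="$cmd -$cap"
    done
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
        echo "sysctl net.inet.tcp.tso=0"
    else
        # Word splitting of $cmd is intentional here.
        $cmd && sysctl net.inet.tcp.tso=0
    fi
}
```

Afterwards the options= line in the output of `ifconfig em0` shows which
capabilities are still enabled.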

