em(4) on FreeBSD is sometimes annoying

Jeremy Chadwick koitsu at FreeBSD.org
Mon Aug 4 09:13:25 UTC 2008


On Mon, Aug 04, 2008 at 10:53:35AM +0200, Martin wrote:
> On Sat, 2 Aug 2008 16:01:35 -0700
> "Jack Vogel" <jfvogel at gmail.com> wrote:
> > > After I typed "/etc/rc.d/netif restart", I waited until I get
> > > "giving up" message. Then I plugged the cable in. After about 30
> > > seconds the link LED was on. I noticed that at this point I
> > > couldn't get an address using DHCP.
> > 
> > Well DUH, the agent exited, thats why it said "giving up" :)
> > That ain't complex behavior, its behaving as designed.
> 
> I'm describing the circumstances WHEN everything happens. I was trying
> to show you that even the cable is plugged in you cannot get an IP. The
> NIC is in a kind of "dead" state.
> 
> > Ya, so the update is slow, the fact that the LED is blinking means you
> > have an autoneg failure, so again, its your switch not the NIC.
> 
> I have this problem with every kind of switch.
> 
> The switch at home is a 100Mbit switch made by Digitus (5-port).

Can you try repeating the problem under Linux?  It may be a bit much to
ask, but I believe there's an Ubuntu Live CD you can download + burn +
boot.  You could try repeating the behaviour there.  If it's identical,
or at least "still broken", then it's less likely FreeBSD's fault.

> > Let me guess, you have some 100Mb home router and you are trying
> > to plug a gig nic into it and forcing the speed maybe?
> 
> This is true except for the "forcing the speed" part. It's set to
> "media: Ethernet autoselect".

Which means it's using auto-neg, which Jack says (based on the
information he has) may be failing upon link loss + reconnect.  As
described, auto-negotiation has to be properly implemented on both the
NIC/PHY and on the switch, as well as handled properly in the NIC
driver.

I can tell you that in the case of the Intel 82573E and FreeBSD's em(4)
driver (version 6.x.x), auto-neg is performed properly, including when
link is lost/cable pulled.  I've personally tested this on numerous
consumer switches (D-Link, Linksys, and Hawking Technologies), as well
as enterprise switches (specifically ProCurve and Cisco).  The one
exception: I have seen odd speed negotiation failures with Netgear
consumer switches (100Mbit being chosen instead of gigE).

In fact, this weekend in my home, I just migrated from a D-Link switch
to an HP ProCurve switch.  I powered off one switch, installed the new
one, powered it on, and link came up.  The swap took a couple of
minutes.  Then I decided to re-organise some of my cabling, which
caused another disconnect.  Here's evidence:

em0: <Intel(R) PRO/1000 Network Connection 6.9.5> port 0x4000-0x401f mem 0xe8000000-0xe801ffff irq 16 at device 0.0 on pci13
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:30:48:97:25:40

em0 at pci0:13:0:0:        class=0x020000 card=0x108c15d9 chip=0x108c8086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82573E Intel Corporation 82573E Gigabit Ethernet Controller (Copper)'
    class      = network
    subclass   = ethernet

icarus# bzgrep "kernel: em0" /var/log/all.log.3.bz2
Jul 31 06:28:23 icarus kernel: em0: link state changed to DOWN
Jul 31 06:30:17 icarus kernel: em0: link state changed to UP
Jul 31 06:32:36 icarus kernel: em0: link state changed to DOWN
Jul 31 06:32:53 icarus kernel: em0: link state changed to UP
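
As a rough sketch only (it is not part of the original output), the length
of each outage can be pulled out of log lines like the ones above with awk.
The function name link_outages is made up for illustration.  It assumes the
syslog layout shown above (month, day, HH:MM:SS, host, "kernel:", interface)
and that a DOWN event and its matching UP event fall on the same day.

```shell
#!/bin/sh
# Hypothetical helper: read "link state changed" syslog lines on stdin and
# print how long each DOWN -> UP gap lasted, in seconds.
# Assumption: both events of a pair occur on the same calendar day, so only
# the HH:MM:SS field ($3) is compared.
link_outages() {
    awk '
        /link state changed to DOWN/ {
            split($3, t, ":")
            down = t[1] * 3600 + t[2] * 60 + t[3]
            have_down = 1
            next
        }
        /link state changed to UP/ && have_down {
            split($3, t, ":")
            up = t[1] * 3600 + t[2] * 60 + t[3]
            # $6 is the interface name with its trailing colon, e.g. "em0:"
            print $6, "down for", up - down, "seconds"
            have_down = 0
        }
    '
}

# Demo using the four log lines quoted above.
link_outages <<'EOF'
Jul 31 06:28:23 icarus kernel: em0: link state changed to DOWN
Jul 31 06:30:17 icarus kernel: em0: link state changed to UP
Jul 31 06:32:36 icarus kernel: em0: link state changed to DOWN
Jul 31 06:32:53 icarus kernel: em0: link state changed to UP
EOF
# prints:
#   em0: down for 114 seconds
#   em0: down for 17 seconds
```

On a live system the input would come from the same bzgrep command shown
above, piped into link_outages.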

And absolutely no problems:

icarus# netstat -in -I em0
Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
em0    1500 <Link#1>      00:30:48:97:25:40 32941661     0 34620277     0     0
em0    1500 192.168.1.0/2 192.168.1.51      32915748     - 35942133     -     -
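
If you want to check those counters from a script rather than by eye, here
is a minimal sketch.  The function name link_error_counters is hypothetical,
and it assumes the header names (Ierrs, Oerrs, Coll) and the populated
<Link#N> row seen in the netstat output above; other netstat versions may
lay the columns out differently.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -in -I <ifname>` output on stdin and
# print the error counters from the <Link#N> row.
# Columns are located by header name rather than by fixed position.
# Assumption: every column of the <Link#N> row is populated, so header
# positions and data positions line up.
link_error_counters() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        $3 ~ /^<Link#/ {
            printf "Ierrs=%s Oerrs=%s Coll=%s\n",
                $(col["Ierrs"]), $(col["Oerrs"]), $(col["Coll"])
        }
    '
}

# Demo using the netstat output quoted above.
link_error_counters <<'EOF'
Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
em0    1500 <Link#1>      00:30:48:97:25:40 32941661     0 34620277     0     0
em0    1500 192.168.1.0/2 192.168.1.51      32915748     - 35942133     -     -
EOF
# prints: Ierrs=0 Oerrs=0 Coll=0
```

Non-zero input errors or collisions on the link row would be a hint to look
at the cable, duplex, or negotiation before blaming the driver.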

icarus# ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:30:48:97:25:40
        inet 192.168.1.51 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active
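
For comparing results across switches, the media line is the interesting
part: it shows whether the interface is on autoselect or forced, and what
was actually negotiated.  The sketch below extracts that; media_summary is
a made-up name, and it assumes the "media: Ethernet ..." line format shown
in the ifconfig output above.

```shell
#!/bin/sh
# Hypothetical helper: read `ifconfig <ifname>` output on stdin and report
# whether the media setting is autoselect or forced, plus the active media.
# Assumption: the line looks like one of
#   media: Ethernet autoselect (1000baseTX <full-duplex>)
#   media: Ethernet autoselect
#   media: Ethernet 100baseTX <full-duplex>
media_summary() {
    awk '
        $1 == "media:" {
            mode = ($3 == "autoselect") ? "autoselect" : "forced"
            # With autoselect the negotiated media is the 4th field, in
            # parentheses; with a forced setting it is the 3rd field.
            active = (mode == "autoselect") ? $4 : $3
            gsub(/[()]/, "", active)
            print "mode=" mode, "active=" (active == "" ? "unknown" : active)
        }
    '
}

# Demo using the media line quoted above.
printf '%s\n' '        media: Ethernet autoselect (1000baseTX <full-duplex>)' \
    | media_summary
# prints: mode=autoselect active=1000baseTX
```

A gigabit NIC on autoselect that reports a 100Mbit media type against a
gigabit switch would match the Netgear behaviour described earlier; against
a 100Mbit switch it is simply the expected result.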

What I'm saying is "I don't know what to tell you".  I'm not doubting
your claims, but it would be worthwhile to test on Linux to see if it's
a FreeBSD driver issue, something with the NIC/PHY, the way the NIC/PHY
is implemented on the computer, or even the cable (yes really!).  I'd
start with the obvious: try replacing the cable, and go with a CAT5e
cable that's pre-made (rather than self-rolled, if you're using such).

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



More information about the freebsd-stable mailing list