em driver losing packets

Scott T. Smith scott at gelatinous.com
Wed May 12 14:28:11 PDT 2004


I have a Sun 1U server with 2 built-in Intel Pro/1000 "LOMs" (though I
had the exact same problem with a previous machine using a standalone
Intel NIC).  I notice that after the machine has been up for 12-20
hours, the network card starts dropping packets.

Here is the relevant dmesg info:

em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.25> port
0x2040-0x207f mem 0xfe680000-0xfe69ffff irq 30 at device 7.0 on pci3
em0:  Speed:N/A  Duplex:N/A
em1: <Intel(R) PRO/1000 Network Connection, Version - 1.7.25> port
0x2000-0x203f mem 0xfe6a0000-0xfe6bffff irq 31 at device 7.1 on pci3
em1:  Speed:N/A  Duplex:N/A

....

em0: Link is up 100 Mbps Full Duplex
em1: Link is up 1000 Mbps Full Duplex

....

Limiting icmp unreach response from 1770 to 200 packets/sec
^^^ Not sure what this is, but I saw a burst of these while
everything was still working, shortly before it stopped working.
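If I'm reading it right, that message is the kernel's ICMP rate
limiter kicking in, so something was generating a flood of
unreachable replies (maybe the UDP stream hitting a closed port?).
I assume the relevant knob is net.inet.icmp.icmplim:

    # current ICMP response rate limit (the 200/sec above); default is 200
    sysctl net.inet.icmp.icmplim
    # raise it if the limiter itself turns out to matter (value is just an example)
    sysctl net.inet.icmp.icmplim=1000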
....

em1: Excessive collisions = 0
em1: Symbol errors = 0
em1: Sequence errors = 0
em1: Defer count = 0
em1: Missed Packets = 1682
em1: Receive No Buffers = 75
em1: Receive length errors = 0
em1: Receive errors = 0
em1: Crc errors = 0
em1: Alignment errors = 0
em1: Carrier extension errors = 0
em1: XON Rcvd = 0
em1: XON Xmtd = 0
em1: XOFF Rcvd = 0
em1: XOFF Xmtd = 0
em1: Good Packets Rcvd = 119975570
em1: Good Packets Xmtd = 164
em1: Adapter hardware address = 0xc76262ec 
em1:tx_int_delay = 66, tx_abs_int_delay = 66
em1:rx_int_delay = 488, rx_abs_int_delay = 977
em1: fifo workaround = 0, fifo_reset = 0
em1: hw tdh = 170, hw tdt = 170
em1: Num Tx descriptors avail = 256
em1: Tx Descriptors not avail1 = 0
em1: Tx Descriptors not avail2 = 0
em1: Std mbuf failed = 0
em1: Std mbuf cluster failed = 0
em1: Driver dropped packets = 0


I was running 5.2.1-RELEASE with the em driver it ships with (version
1.7.19 or 1.7.17; I forget which).  Since I had the problem with that
driver, I backported 1.7.25 from 5.2.1-STABLE as of May 10.  Same issue.

Notice the "missed packets" and "receive no buffers".  I assume that
means the network card ran out of memory?  How much memory does it
have?  If it uses the mainboard memory, can I make that amount any
bigger?
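In case the drops are really mbuf/cluster exhaustion on the host
rather than memory on the card, I've also been checking with the
usual tools (assuming I have the flags right):

    # mbuf and cluster usage, including requests denied for lack of memory
    netstat -m
    # per-interface input errors and drops
    netstat -id
    # current cluster limit; I assume this is the knob if the pool is too small
    sysctl kern.ipc.nmbclusters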

The odd thing (which is why I think this is a driver issue) is that it
works just fine when the machine is first booted.

I am driving approximately 680 Mbit/s of UDP traffic in 1316-byte
packets.  The only other traffic is ARP (em1 has a netmask of
255.255.255.255).

I have this problem whether I use kernel polling (HZ=1000),
rx_abs_int_delay=1000, or rx_abs_int_delay=500.  If I shut off the
rx_*int_delay values entirely, CPU load goes to 100% and I still have
the same problem.  With the abs delay at 1000, CPU load is about 90%
(split roughly evenly between user and system).
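For completeness, here is roughly how I have polling set up (kernel
config options plus the runtime sysctls); I may be misremembering the
exact knob names:

    # kernel config
    options DEVICE_POLLING
    options HZ=1000

    # enabled at runtime with
    sysctl kern.polling.enable=1
    # fraction of each tick reserved for userland work (default 50)
    sysctl kern.polling.user_frac=50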

If you have any ideas, I'd really appreciate it.  Thanks!  I'm
thinking of trying to backport 1.7.31.

        Scott



