em0 watchdog timeout

Jeremy Chadwick freebsd at jdc.parodius.com
Thu Nov 10 10:03:55 UTC 2011


On Thu, Nov 10, 2011 at 10:22:39AM +0100, Willem Jan Withagen wrote:
> Still running this file server on ZFS, and every now and then em0
> goes down, and is not revivable.... Nothing goes in or out the
> box...
> 
> Any suggestions as how to (help) fix this?

CC'ing Jack Vogel of Intel.

We need "pciconf -lvbc" output (-lv by itself isn't sufficient in this
regard).

Also, please run "sysctl dev.em.0.debug=1".  The command itself prints
nothing useful, but "dmesg" shortly afterwards should contain a bunch
of driver-level debugging information that should help (the output
starts with "Interface is ...").  Please provide that too.
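
If it helps, the two requests above could be scripted roughly as
follows.  This is only a sketch: the function name collect_em_debug,
the default output path and the one-second pause are placeholders of
mine, not anything the driver requires.  It needs to run as root,
since setting the sysctl requires privileges.

```shell
#!/bin/sh
# Sketch: gather the em(4) diagnostics requested above into one file.
# Assumptions (illustrative only): the function name, the default
# output path, and the 1-second pause before reading dmesg.

collect_em_debug() {
    unit="${1:-0}"                          # em unit number, 0 for em0
    out="${2:-/tmp/em${unit}-debug.txt}"    # where to save the report

    {
        echo "== pciconf -lvbc =="
        pciconf -lvbc

        echo "== sysctl dev.em.${unit}.debug=1 =="
        # Setting this to 1 makes the driver dump its state to the
        # kernel message buffer; the sysctl output itself is not useful.
        sysctl "dev.em.${unit}.debug=1"

        sleep 1                             # let the driver finish logging

        echo "== dmesg (driver output starts with 'Interface is') =="
        dmesg
    } > "$out" 2>&1

    # Print the report path so the caller knows what to attach.
    echo "$out"
}
```

Calling "collect_em_debug 0" right after a watchdog timeout would
leave everything in /tmp/em0-debug.txt, ready to attach to a reply.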

> Nov 10 09:07:41 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:07:41 zfs kernel: em0: Queue(0) tdh = 187, hw tdt = 189
> Nov 10 09:07:41 zfs kernel: em0: TX(0) desc avail = 1022,Next TX to Clean = 187
> Nov 10 09:11:32 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:11:32 zfs kernel: em0: Queue(0) tdh = 139, hw tdt = 151
> Nov 10 09:11:32 zfs kernel: em0: TX(0) desc avail = 1012,Next TX to Clean = 139
> Nov 10 09:16:05 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:16:05 zfs kernel: em0: Queue(0) tdh = 152, hw tdt = 163
> Nov 10 09:16:05 zfs kernel: em0: TX(0) desc avail = 1013,Next TX to Clean = 152
> Nov 10 09:33:10 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:33:10 zfs kernel: em0: Queue(0) tdh = 161, hw tdt = 176
> Nov 10 09:33:10 zfs kernel: em0: TX(0) desc avail = 1008,Next TX to Clean = 160
> Nov 10 09:53:18 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:53:18 zfs kernel: em0: Queue(0) tdh = 157, hw tdt = 172
> Nov 10 09:53:18 zfs kernel: em0: TX(0) desc avail = 1009,Next TX to Clean = 157
> 
> Device is:
> Nov 10 10:07:27 zfs kernel: em0: <Intel(R) PRO/1000 Network Connection 7.2.3> port 0x1820-0x183f mem 0xdf900000-0xdf91ffff,0xdf924000-0xdf924fff irq 16 at device 25.0 on pci0
> Nov 10 10:07:27 zfs kernel: em0: Using an MSI interrupt
> Nov 10 10:07:27 zfs kernel: em0: [FILTER]
> 
> pciconf -lv:
> em0@pci0:0:25:0:        class=0x020000 card=0x10bd15d9 chip=0x10bd8086 rev=0x02 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = 'Intel 82566DM Gigabit Ethernet Adapter (82566DM)'
>     class      = network
>     subclass   = ethernet
> 
> uname:
> 	8.2-STABLE FreeBSD 8.2-STABLE #12: Sun Oct  2 13:36:55 CEST 2011
> 	amd64
> 
> sysctl -a | grep em.0:
> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
> dev.em.0.%driver: em
> dev.em.0.%location: slot=25 function=0 handle=\_SB_.PCI0.LAN_
> dev.em.0.%pnpinfo: vendor=0x8086 device=0x10bd subvendor=0x15d9 subdevice=0x10bd class=0x020000
> dev.em.0.%parent: pci0
> dev.em.0.nvm: -1
> dev.em.0.debug: -1
> dev.em.0.rx_int_delay: 0
> dev.em.0.tx_int_delay: 66
> dev.em.0.rx_abs_int_delay: 66
> dev.em.0.tx_abs_int_delay: 66
> dev.em.0.rx_processing_limit: 100
> dev.em.0.flow_control: 3
> dev.em.0.eee_control: 0
> dev.em.0.link_irq: 0
> dev.em.0.mbuf_alloc_fail: 0
> dev.em.0.cluster_alloc_fail: 0
> dev.em.0.dropped: 0
> dev.em.0.tx_dma_fail: 0
> dev.em.0.rx_overruns: 6
> dev.em.0.watchdog_timeouts: 5
> dev.em.0.device_control: 1074790976
> dev.em.0.rx_control: 67141634
> dev.em.0.fc_high_water: 8192
> dev.em.0.fc_low_water: 6692
> dev.em.0.queue0.txd_head: 78
> dev.em.0.queue0.txd_tail: 78
> dev.em.0.queue0.tx_irq: 0
> dev.em.0.queue0.no_desc_avail: 0
> dev.em.0.queue0.rxd_head: 376
> dev.em.0.queue0.rxd_tail: 375
> dev.em.0.queue0.rx_irq: 0
> dev.em.0.mac_stats.excess_coll: 0
> dev.em.0.mac_stats.single_coll: 0
> dev.em.0.mac_stats.multiple_coll: 0
> dev.em.0.mac_stats.late_coll: 0
> dev.em.0.mac_stats.collision_count: 0
> dev.em.0.mac_stats.symbol_errors: 0
> dev.em.0.mac_stats.sequence_errors: 0
> dev.em.0.mac_stats.defer_count: 0
> dev.em.0.mac_stats.missed_packets: 9
> dev.em.0.mac_stats.recv_no_buff: 0
> dev.em.0.mac_stats.recv_undersize: 0
> dev.em.0.mac_stats.recv_fragmented: 0
> dev.em.0.mac_stats.recv_oversize: 0
> dev.em.0.mac_stats.recv_jabber: 0
> dev.em.0.mac_stats.recv_errs: 1
> dev.em.0.mac_stats.crc_errs: 1
> dev.em.0.mac_stats.alignment_errs: 0
> dev.em.0.mac_stats.coll_ext_errs: 0
> dev.em.0.mac_stats.xon_recvd: 0
> dev.em.0.mac_stats.xon_txd: 0
> dev.em.0.mac_stats.xoff_recvd: 0
> dev.em.0.mac_stats.xoff_txd: 0
> dev.em.0.mac_stats.total_pkts_recvd: 160062850
> dev.em.0.mac_stats.good_pkts_recvd: 160062840
> dev.em.0.mac_stats.bcast_pkts_recvd: 79648
> dev.em.0.mac_stats.mcast_pkts_recvd: 10220
> dev.em.0.mac_stats.rx_frames_64: 0
> dev.em.0.mac_stats.rx_frames_65_127: 0
> dev.em.0.mac_stats.rx_frames_128_255: 0
> dev.em.0.mac_stats.rx_frames_256_511: 0
> dev.em.0.mac_stats.rx_frames_512_1023: 0
> dev.em.0.mac_stats.rx_frames_1024_1522: 0
> dev.em.0.mac_stats.good_octets_recvd: 107143604749
> dev.em.0.mac_stats.good_octets_txd: 129876768158
> dev.em.0.mac_stats.total_pkts_txd: 179010567
> dev.em.0.mac_stats.good_pkts_txd: 179010567
> dev.em.0.mac_stats.bcast_pkts_txd: 14608
> dev.em.0.mac_stats.mcast_pkts_txd: 206
> dev.em.0.mac_stats.tx_frames_64: 0
> dev.em.0.mac_stats.tx_frames_65_127: 0
> dev.em.0.mac_stats.tx_frames_128_255: 0
> dev.em.0.mac_stats.tx_frames_256_511: 0
> dev.em.0.mac_stats.tx_frames_512_1023: 0
> dev.em.0.mac_stats.tx_frames_1024_1522: 0
> dev.em.0.mac_stats.tso_txd: 3691806
> dev.em.0.mac_stats.tso_ctx_fail: 0
> dev.em.0.interrupts.asserts: 130023913
> dev.em.0.interrupts.rx_pkt_timer: 0
> dev.em.0.interrupts.rx_abs_timer: 0
> dev.em.0.interrupts.tx_pkt_timer: 0
> dev.em.0.interrupts.tx_abs_timer: 0
> dev.em.0.interrupts.tx_queue_empty: 0
> dev.em.0.interrupts.tx_queue_min_thresh: 0
> dev.em.0.interrupts.rx_desc_min_thresh: 0
> dev.em.0.interrupts.rx_overrun: 0
> dev.em.0.wake: 0

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



More information about the freebsd-stable mailing list