anyone seeing problems with bce driver?

Julian Elischer julian at elischer.org
Fri Aug 11 22:07:58 UTC 2006


We're seeing the following problems on a 6.1-based system with a bce
interface.
I've looked at -current and RELENG_6 and don't see any changes that
might affect this.

Before I go diving further into the code, does anyone recognise this?
It looks like the constantly incoming packets may be stopping the PHY from
getting reinitialised.
I am guessing that when reinitialising we should stop new data from hitting
the chip.
I'm guessing this is what is supposed to be happening, but there is a window
somewhere that is letting some through.

----problem reported by admin of machine in question: ----

If data is being sent out on the bce interfaces and you pull out the Ethernet
cable and plug it back in, the bce interface will stay off the network until you
stop all users from trying to send data, and /var/log/messages will show
the watchdog timer resetting my connection every 10 seconds or so. Once you
stop your network traffic, the interface comes up and functions again.
Judging by how often we have been seeing this, networks have frequent
hiccups that put the interface into this state. Usually there is a gap in
the traffic every now and then that allows the NIC to recover.


The repeated watchdog resets can be seen in /var/log/messages:
Aug 10 19:50:14  kernel: bce1: ../../../dev/bce/if_bce.c(5037): Watchdog timeout occurred, resetting!
Aug 10 19:50:14  kernel: bce1: link state changed to DOWN
Aug 10 19:50:16  kernel: bce1: link state changed to UP
Aug 10 19:50:26  kernel: bce1: ../../../dev/bce/if_bce.c(5037): Watchdog timeout occurred, resetting!
Aug 10 19:50:26  kernel: bce1: link state changed to DOWN
Aug 10 19:50:28  kernel: bce1: link state changed to UP
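
For anyone wanting to check their own logs for the same pattern, here is a
rough sketch that measures the gap between consecutive watchdog resets per
interface. It is not part of the driver or of any FreeBSD tool. The function
name watchdog_intervals and the year argument are made up for illustration,
and it assumes syslog timestamps in the format shown above with English month
abbreviations.

```python
import re
from datetime import datetime

# Matches the bce watchdog message. It only needs the text up to "Watchdog",
# so it works whether or not the mailer wrapped the rest of the line.
WATCHDOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+.*?kernel: "
    r"(?P<ifname>bce\d+): .*Watchdog"
)


def watchdog_intervals(lines, year=2006):
    """Return {interface: [seconds between consecutive watchdog resets]}."""
    last_seen = {}
    gaps = {}
    for line in lines:
        match = WATCHDOG_RE.match(line)
        if match is None:
            continue
        # syslog timestamps carry no year, so the caller supplies one
        # (an assumption of this sketch).
        stamp = datetime.strptime(
            f"{year} {match.group('stamp')}", "%Y %b %d %H:%M:%S"
        )
        name = match.group("ifname")
        if name in last_seen:
            delta = stamp - last_seen[name]
            gaps.setdefault(name, []).append(int(delta.total_seconds()))
        last_seen[name] = stamp
    return gaps


if __name__ == "__main__":
    # The log lines quoted in the report above.
    sample = [
        "Aug 10 19:50:14  kernel: bce1: ../../../dev/bce/if_bce.c(5037): Watchdog",
        "timeout occurred, resetting!",
        "Aug 10 19:50:14  kernel: bce1: link state changed to DOWN",
        "Aug 10 19:50:16  kernel: bce1: link state changed to UP",
        "Aug 10 19:50:26  kernel: bce1: ../../../dev/bce/if_bce.c(5037): Watchdog",
        "timeout occurred, resetting!",
    ]
    print(watchdog_intervals(sample))
```

On the lines quoted above this reports a 12 second gap for bce1, which fits
the "every 10 seconds or so" description.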

Repro:
1) Have the machine sending out data.  I was flood-pinging another box.
   Since the box is going to fall off the network, you had better be using the
   console.
     ping -f -s 1500 otherbox > /dev/null &
2) Pull the wire going into bce1 for a few seconds and put it back.
3) bce1 will no longer be usable.
4) Kill the ping command to allow the bce1 interface to work again.


-------------



More information about the freebsd-current mailing list