Intermittent network issues with FreeBSD 6.2
Jeremy Chadwick
koitsu at FreeBSD.org
Wed Feb 28 09:20:01 UTC 2007
On Thu, Feb 15, 2007 at 03:54:18PM +1100, Dimuthu Parussalla wrote:
> Hi,
>
> Dmesg output related to bge as follows.
>
> miibus0: <MII bus> on bge0
> brgphy0: <BCM5750 10/100/1000baseTX PHY> on miibus0
> brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
> 1000baseTX-FDX, auto
> bge0: Ethernet address: 00:11:25:e9:7f:58
> bge0: [GIANT-LOCKED]
> pcib6: <ACPI PCI-PCI bridge> at device 5.0 on pci0
> pci8: <ACPI PCI bus> on pcib6
> bge1: <Broadcom BCM5750 B1, ASIC rev. 0x4101> mem 0xc6ff0000-0xc6ffffff irq
> 16 at device 0.0 on pci8
> miibus1: <MII bus> on bge1
> brgphy1: <BCM5750 10/100/1000baseTX PHY> on miibus1
> brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
> 1000baseTX-FDX, auto
> bge1: Ethernet address: 00:11:25:e9:7f:59
> bge1: [GIANT-LOCKED]
Interestingly enough, this problem just started haunting us too (out
of nowhere), on one of our Supermicro systems. There haven't been
any changes to the network in literally months (no one's been to the
datacenter since December).
Here are our details:
* Upstream switch is an HP ProCurve 2626. All ports used are
  100 Mbit, with auto-select enabled (speed/duplex negotiation)
* Speed/duplex negotiation is being done correctly. We have no
  throughput problems (either direction) or otherwise
* netstat -i -n shows no errors, except for two output errors,
  which are probably due to the interface being brought down
  and back up rudely (see below)
* Switch shows no errors on either interface
* Cabling is good (CAT6, no less)
* Uniprocessor system; kernel not built with SMP
What we see:
Feb 17 11:22:00 eos kernel: bge0: watchdog timeout -- resetting
Feb 17 11:22:00 eos kernel: bge0: link state changed to DOWN
Feb 17 11:22:01 eos kernel: bge0: link state changed to UP
Feb 24 11:20:56 eos kernel: bge0: watchdog timeout -- resetting
Feb 24 11:20:56 eos kernel: bge0: link state changed to DOWN
Feb 24 11:20:58 eos kernel: bge0: link state changed to UP
These timestamps are awfully suspicious: almost exactly 7 days apart,
to within about a minute. And no, we have no cron jobs or anything else
that runs at that time (this box is hardly used for anything).
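For anyone who wants to check the spacing between the two watchdog resets, here is a minimal Python sketch of the arithmetic. The two timestamps are copied from the log lines above; syslog does not record the year, so 2007 is an assumption taken from the date of this post, and the variable names are just for illustration.

```python
from datetime import datetime

# Watchdog timestamps copied from the syslog lines above. Syslog omits the
# year; 2007 is assumed from the date of this post.
FMT = "%Y %b %d %H:%M:%S"
first = datetime.strptime("2007 Feb 17 11:22:00", FMT)
second = datetime.strptime("2007 Feb 24 11:20:56", FMT)

# Gap between the two resets, and how far it falls short of a full week.
interval = second - first
short_of_week = 7 * 86400 - interval.total_seconds()

print(interval)        # 6 days, 23:58:56
print(short_of_week)   # 64.0
```

So the second reset came 64 seconds short of a full week after the first.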
Applicable system information:
(I'm including ichsmb/smbus because it shares an IRQ with bge1;
nothing shares an IRQ with bge0)
bge0: <Broadcom BCM5750 B1, ASIC rev. 0x4101> mem 0xd0100000-0xd010ffff irq 18 at device 0.0 on pci4
miibus0: <MII bus> on bge0
brgphy0: <BCM5750 10/100/1000baseTX PHY> on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:30:48:81:fc:8a
pcib5: <ACPI PCI-PCI bridge> irq 19 at device 28.3 on pci0
pci5: <ACPI PCI bus> on pcib5
bge1: <Broadcom BCM5750 B1, ASIC rev. 0x4101> mem 0xd0200000-0xd020ffff irq 19 at device 0.0 on pci5
miibus1: <MII bus> on bge1
brgphy1: <BCM5750 10/100/1000baseTX PHY> on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: Ethernet address: 00:30:48:81:fc:8b
ichsmb0: <SMBus controller> port 0x500-0x51f irq 19 at device 31.3 on pci0
ichsmb0: [GIANT-LOCKED]
smbus0: <System Management Bus> on ichsmb0
smb0: <SMBus generic I/O> on smbus0
Odd that pciconf -lv shows this as a BCM5750 A1 while the kernel shows
this as a BCM5750 B1. Is this indicative of anything?
bge0 at pci4:0:0: class=0x020000 card=0x02c615d9 chip=0x165914e4 rev=0x11 hdr=0x00
vendor = 'Broadcom Corporation'
device = 'BCM5750A1 NetXtreme Gigabit Ethernet PCI Express'
class = network
subclass = ethernet
bge1 at pci5:0:0: class=0x020000 card=0x02c615d9 chip=0x165914e4 rev=0x11 hdr=0x00
vendor = 'Broadcom Corporation'
device = 'BCM5750A1 NetXtreme Gigabit Ethernet PCI Express'
class = network
subclass = ethernet
[jdc at eos ~]$ vmstat -i
interrupt                  total     rate
irq4: sio0                     6        0
irq6: fdc0                    14        0
irq14: ata0               520782        0
irq15: ata1                   58        0
irq18: bge0             21839717       11
irq19: bge1+               32914        0
cpu0: timer           3638265059     1968
Total                 3660658550     1981
[jdc at eos ~]$ netstat -in
Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
bge0   1500 <Link#1>      00:30:48:81:fc:8a 13841423     0 10349370     2     0
bge0   1500 72.20.106/25  72.20.106.2        3590195     - 10348720     -     -
bge0   1500 72.20.106.3/3 72.20.106.3        2075045     -        0     -     -
bge0   1500 72.20.106.4/3 72.20.106.4        2003973     -        0     -     -
bge0   1500 72.20.106.5/3 72.20.106.5        2328549     -        0     -     -
bge0   1500 72.20.106.6/3 72.20.106.6        2006174     -        0     -     -
bge1   1500 <Link#2>      00:30:48:81:fc:8b     3888     0    29600     0     0
bge1   1500 10            10.72.0.1             2605     -     2605     -     -
lo0   16384 <Link#3>                             641     0      641     0     0
lo0   16384 127           127.0.0.1              641     -      641     -     -
bridg  1500 <Link#4>      86:ec:97:73:50:03    26993     0    30885     0     0
tap0   1500 <Link#5>      00:bd:ed:13:00:00    25712     0     1286     0     0
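If someone wants to watch these counters over time instead of eyeballing them, a rough Python sketch along these lines would pull the per-interface error counts out of `netstat -in` output. The function name `link_errors` is made up for illustration, the sample rows are a subset of the output above, and the parsing assumes the column layout shown here (it has not been checked against other netstat versions).

```python
def link_errors(netstat_output):
    """Return {interface: (Ierrs, Oerrs)} from the <Link#N> rows of `netstat -in`."""
    result = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only link-level rows carry numeric error counters; address rows show "-".
        if len(fields) < 8 or not fields[2].startswith("<Link#"):
            continue
        # Count from the right: Ipkts Ierrs Opkts Oerrs Coll. The Address
        # column is empty for lo0, so indexing from the left would be off by one.
        ierrs, oerrs = int(fields[-4]), int(fields[-2])
        result[fields[0]] = (ierrs, oerrs)
    return result


# Sample rows copied from the netstat -in output above.
SAMPLE = """\
bge0 1500 <Link#1> 00:30:48:81:fc:8a 13841423 0 10349370 2 0
bge0 1500 72.20.106/25 72.20.106.2 3590195 - 10348720 - -
bge1 1500 <Link#2> 00:30:48:81:fc:8b 3888 0 29600 0 0
lo0 16384 <Link#3> 641 0 641 0 0
"""

errors = link_errors(SAMPLE)
for name, (ierrs, oerrs) in errors.items():
    if ierrs or oerrs:
        print(name, "Ierrs:", ierrs, "Oerrs:", oerrs)   # bge0 Ierrs: 0 Oerrs: 2
```

Run against the sample, only bge0 is reported, with the two output errors mentioned earlier.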
If a developer wants access to this box, I can provide it. No serial
console at this time (soon, soon...), but can provide root.
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |