SunFire X2200 ilo's bge1 DOWN/UP
Daniel Braniss
danny at cs.huji.ac.il
Wed May 29 13:02:03 UTC 2013
> On Tue, May 28, 2013 at 09:55:24AM +0300, Daniel Braniss wrote:
> > > On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote:
> > > > > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote:
> > > > > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
> > > > > > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200,
> > > > > > >
> > > > > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.
> > > > > > >
> > > > > >
> > > > > > bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003> mem
> > > > > > 0xfdff0000-0xfdffffff,0xfdfe0000-0xfdfeffff irq 17 at device 4.0 on pci6
> > > > > > bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > > > > > miibus2: <MII bus> on bge0
> > > > > > brgphy0: <BCM5714 1000BASE-T media interface> PHY 1 on miibus2
> > > > > > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
> > > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > > > > bge0: Ethernet address: 00:1b:24:5d:5b:bd
> > > > > > bge1: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003> mem
> > > > > > 0xfdfc0000-0xfdfcffff,0xfdfb0000-0xfdfbffff irq 18 at device 4.1 on pci6
> > > > > > bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > > > > > miibus3: <MII bus> on bge1
> > > > > > brgphy1: <BCM5714 1000BASE-T media interface> PHY 1 on miibus3
> > > > > > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
> > > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > > > > bge1: Ethernet address: 00:1b:24:5d:5b:be
> > > > > >
> > > > > > sf-10> ifconfig bge1
> > > > > > bge1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
> > > > > > > options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
> > > > > > ether 00:1b:24:5d:5b:be
> > > > > > nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> > > > > > media: Ethernet autoselect (100baseTX <full-duplex>)
> > > > > > status: active
> > > > > >
> > > > >
> > > > > Because bge1 is not UP, I wonder how you get link UP/DOWN events.
> > > > > Do you have some network script run by cron?
> > > >
> > > > no scripts.
> > > > this port is shared with the ILO/IPMI, and back in March you fixed a problem
> > > > that it was hanging soon after it was initialized by the driver,
> > > > (r248226 - but I'm not sure if it was ever MFC'ed).
> > >
> > > It was MFCed.
> > >
> > > > Initially I thought it could be caused by connections to it from other
> > > > hosts (either via the web, or ssh) so I killed them, but it didn't help.
> > > > Without that patch the connection fails, and I don't see any DOWN/UP.
> > >
> > > Could you check how many interrupts you get from bge1?
> > > Ideally you shouldn't get any interrupts for bge1.
> >
> > it's not even mentioned :-)
> > sf-04> vmstat -i
> > interrupt                 total   rate
> > irq3: uart1                 964      0
> > irq4: uart0                   6      0
> > irq14: ata0              227354      0
> > irq17: bge0             1021981      2
> > irq21: ohci0                 28      0
> > irq22: ehci0                  2      0
> > irq23: atapci1           293228      0
> > cpu0:timer            383244076   1124
> > cpu1:timer              2225144      6
> > cpu2:timer              2056087      6
> > cpu3:timer              2093943      6
> > Total                 391162813   1147
> >
>
> Then the only way a link UP/DOWN event could be generated for a DOWN
> interface would be a media status query
> (e.g. ifconfig -a) triggered by an external application. Most
> drivers I touched check the IFF_UP flag before poking the media status
> register. However, I'm not sure that is what you're seeing, because you
> do not use any network script run by cron.
> Anyway, try the attached patch and let me know whether it makes any
> difference.
>
> > >
> > > >
> > > > >
> > > > > > > > > is toggling bge1 DOWN/UP every few hours, this port is being used by the ILO.
> > > > > > > > To check, I upgraded another identical host, and the same problem appears.
> > > > > > >
> > > > > > > What is the last known working revision?
> > > > > >
> > > > > > I have no idea, but I have older versions, and I'll start from the oldest
> > > > > > (9.1-prerelease), but
> > > > > > it will take time, since it takes hours till it happens.
> > > > > >
> > > > >
> > > > > ok.
> > > >
> > > >
> >
> >
>
> Attached patch (bge.media_sts.diff):
>
> Index: sys/dev/bge/if_bge.c
> ===================================================================
> --- sys/dev/bge/if_bge.c (revision 251021)
> +++ sys/dev/bge/if_bge.c (working copy)
> @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar
>
>  	BGE_LOCK(sc);
>  
> +	if ((ifp->if_flags & IFF_UP) == 0) {
> +		BGE_UNLOCK(sc);
> +		return;
> +	}
>  	if (sc->bge_flags & BGE_FLAG_TBI) {
>  		ifmr->ifm_status = IFM_AVALID;
>  		ifmr->ifm_active = IFM_ETHER;
>
done, will let you know in 24 hours.
thanks,
danny
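
For readers following the thread, here is a minimal userland sketch of the idea behind the patch: a media status callback that returns early when the interface is not administratively UP, so the PHY is never touched. It is not the driver code. The `mock_*` types, the `MOCK_IFM_*` constants and the `phy_polls` counter are hypothetical stand-ins invented for illustration, and the driver lock (BGE_LOCK/BGE_UNLOCK) is left out. The IFF_* values are the ones from FreeBSD's <net/if.h>; 0x8802 is the flags value from the 'ifconfig bge1' output quoted in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Interface flag values as defined in FreeBSD's <net/if.h>. */
#define IFF_UP        0x0001
#define IFF_BROADCAST 0x0002
#define IFF_SIMPLEX   0x0800
#define IFF_MULTICAST 0x8000

/* Hypothetical placeholders, not the real IFM_* constants. */
#define MOCK_IFM_AVALID 0x1
#define MOCK_IFM_ETHER  0x2

/* Hypothetical stand-in for struct ifnet. */
struct mock_ifnet {
	int if_flags;   /* IFF_* bits, as printed by ifconfig */
	int phy_polls;  /* how many times the "PHY" was queried */
};

/* Hypothetical stand-in for struct ifmediareq. */
struct mock_ifmediareq {
	int ifm_status;
	int ifm_active;
};

/*
 * Model of a media status callback with the IFF_UP guard.
 * When the interface is down, nothing is written to ifmr and
 * the PHY poll counter is not incremented.
 */
void
mock_ifmedia_sts(struct mock_ifnet *ifp, struct mock_ifmediareq *ifmr)
{
	assert(ifp != NULL && ifmr != NULL);

	if ((ifp->if_flags & IFF_UP) == 0)
		return;  /* administratively down: leave the PHY alone */

	ifp->phy_polls++;  /* stands in for reading the media status */
	ifmr->ifm_status = MOCK_IFM_AVALID;
	ifmr->ifm_active = MOCK_IFM_ETHER;
}
```

With if_flags set to 0x8802 (BROADCAST, SIMPLEX, MULTICAST, no UP bit), calling mock_ifmedia_sts() leaves phy_polls at 0. With 0x8802 | IFF_UP the counter goes to 1 and the status fields are filled in.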