[Bug 191786] New: bce link state changes to same state are ignored by lagg

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Thu Jul 10 17:03:01 UTC 2014


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191786

            Bug ID: 191786
           Summary: bce link state changes to same state are ignored by
                    lagg
           Product: Base System
           Version: 9.2-RELEASE
          Hardware: amd64
                OS: Any
            Status: Needs Triage
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs at FreeBSD.org
          Reporter: vegeta at tuxpowered.net

I have a Dell PowerEdge m610 system with 6 bce NICs. The BladeCenter is
equipped with M6348 switches in fabric A and pass-through modules in fabrics B
and C. bce[01] are connected to the switch stack via the backplane and form a
lagg with LACP; bce[23] are connected via a pass-through module to copper ports
in the same switch stack and form another lagg with LACP. bce[45] are connected
elsewhere without LAG.

As far as I understand, an interface changes its state in lagg when it goes
down or up, thanks to if_link_state_change and do_link_state_change in
sys/net/if.c. if_link_state_change contains a check that returns early when the
interface is asked to "change" into the state it is already in.

I've added the following printfs to see what exactly happens:

--- a/sys/dev/bce/if_bce.c
+++ b/sys/dev/bce/if_bce.c
@@ -6460,11 +6460,23 @@ bce_phy_intr(struct bce_softc *sc)
                        if (new_link_state) {
                                if (bootverbose)
                                        if_printf(sc->bce_ifp, "link UP\n");
+                               printf("%s: state change to UP; from 0x%x to 0x%x, calling if_link_state_change(0x%x, 0x%x)\n",
+                                               sc->bce_ifp->if_xname,
+                                               old_link_state,
+                                               new_link_state,
+                                               sc->bce_ifp->if_link_state,
+                                               LINK_STATE_UP);
                                if_link_state_change(sc->bce_ifp,
                                    LINK_STATE_UP);
                        } else {
                                if (bootverbose)
                                        if_printf(sc->bce_ifp, "link DOWN\n");
+                               printf("%s: state change to DOWN; from 0x%x to 0x%x, calling if_link_state_change(0x%x, 0x%x)\n",
+                                               sc->bce_ifp->if_xname,
+                                               old_link_state,
+                                               new_link_state,
+                                               sc->bce_ifp->if_link_state,
+                                               LINK_STATE_DOWN);
                                if_link_state_change(sc->bce_ifp,
                                    LINK_STATE_DOWN);
                        }
--- a/sys/net/if.c
+++ b/sys/net/if.c
@@ -1900,9 +1900,12 @@ void     *(*vlan_cookie_p)(struct ifnet *);
 void
 if_link_state_change(struct ifnet *ifp, int link_state)
 {
+       log(LOG_NOTICE, "%s: link state changed to %s, scheduling do_link_state_change\n", ifp->if_xname, (link_state == LINK_STATE_UP) ? "UP" : "DOWN" );
        /* Return if state hasn't changed. */
-       if (ifp->if_link_state == link_state)
+       if (ifp->if_link_state == link_state) {
+               log(LOG_NOTICE, "%s: not really changed, skipping\n", ifp->if_xname);
                return;
+       }

        ifp->if_link_state = link_state;

@@ -1921,6 +1924,8 @@ do_link_state_change(void *arg, int pending)
        if (ifp->if_vlantrunk != NULL)
                (*vlan_link_state_p)(ifp);

+       log(LOG_NOTICE, "%s: link state changed to %s, calling hooks\n", ifp->if_xname, (link_state == LINK_STATE_UP) ? "UP" : "DOWN" );
+
        if ((ifp->if_type == IFT_ETHER || ifp->if_type == IFT_L2VLAN) &&
            IFP2AC(ifp)->ac_netgraph != NULL)
                (*ng_ether_link_state_p)(ifp, link_state);
@@ -1928,8 +1933,10 @@ do_link_state_change(void *arg, int pending)
                (*carp_linkstate_p)(ifp);
        if (ifp->if_bridge)
                (*bridge_linkstate_p)(ifp);
-       if (ifp->if_lagg)
+       if (ifp->if_lagg) {
+               log(LOG_NOTICE, "%s: hook link_state\n", ifp->if_xname);
                (*lagg_linkstate_p)(ifp, link_state);
+       }

        if (IS_DEFAULT_VNET(curvnet))
                devctl_notify("IFNET", ifp->if_xname,


As howtos and manuals suggest, my rc.conf contains ifconfig_bce[01234]="up".
This causes the following to be logged:

Jul 10 17:18:55 aw19lb2 kernel: bce0: state change to UP; from 0x0 to 0x1, calling if_link_state_change(0x0, 0x2)
Jul 10 17:18:55 aw19lb2 kernel: bce1: state change to UP; from 0x0 to 0x1, calling if_link_state_change(0x0, 0x2)
Jul 10 17:18:55 aw19lb2 kernel: bce2: state change to UP; from 0x0 to 0x1, calling if_link_state_change(0x0, 0x2)
Jul 10 17:18:55 aw19lb2 kernel: bce3: state change to UP; from 0x0 to 0x1, calling if_link_state_change(0x0, 0x2)
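
To make these lines easier to read: the first pair of numbers ("from ... to
...") is the driver's own old/new link flag, while the pair inside
if_link_state_change(...) is the interface's current if_link_state and the
requested state, expressed as LINK_STATE_* values. A small user-space sketch
for decoding that second pair follows. The three constants are copied from
sys/net/if.h so the snippet builds outside the kernel tree; link_state_name
and describe_call are hypothetical helpers that do not exist in the kernel.

```c
#include <stdio.h>

/* Copied from FreeBSD's sys/net/if.h for use outside the kernel tree. */
#define LINK_STATE_UNKNOWN	0	/* link invalid/unknown */
#define LINK_STATE_DOWN		1	/* link is down */
#define LINK_STATE_UP		2	/* link is up */

/* Hypothetical helper: name for a LINK_STATE_* value. */
static const char *
link_state_name(int link_state)
{
	switch (link_state) {
	case LINK_STATE_UNKNOWN:
		return ("UNKNOWN");
	case LINK_STATE_DOWN:
		return ("DOWN");
	case LINK_STATE_UP:
		return ("UP");
	default:
		return ("INVALID");
	}
}

/*
 * Hypothetical helper: describe the "(cur, req)" pair printed by the
 * debugging printf, flagging the case where both values are equal.
 */
static int
describe_call(char *buf, size_t len, int cur, int req)
{
	return (snprintf(buf, len, "%s -> %s%s",
	    link_state_name(cur), link_state_name(req),
	    cur == req ? " (same state)" : ""));
}
```

With this, if_link_state_change(0x0, 0x2) reads as UNKNOWN -> UP, which is a
real transition for all four interfaces at this point.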


Later on the laggs get their ports added:

ifconfig_lagg1_name="internal"
ifconfig_internal="laggproto lacp laggport bce2 laggport bce3"

This causes the following lacp debug messages:

Jul 10 17:18:55 aw19lb2 kernel: bce2: media changed 0x0 -> 0x22, ether = 1, fdx = 0, link = 1
Jul 10 17:18:55 aw19lb2 kernel: bce2: partner timeout changed
Jul 10 17:18:55 aw19lb2 kernel: bce2: LACP_STATE_AGGREGATION==0, key is 8003
Jul 10 17:18:55 aw19lb2 kernel: bce2: -> UNSELECTED

Jul 10 17:18:55 aw19lb2 kernel: bce3: media changed 0x0 -> 0x22, ether = 1, fdx = 0, link = 1
Jul 10 17:18:55 aw19lb2 kernel: bce3: partner timeout changed
Jul 10 17:18:55 aw19lb2 kernel: bce3: LACP_STATE_AGGREGATION==0, key is 8004
Jul 10 17:18:55 aw19lb2 kernel: bce3: -> UNSELECTED
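
The "ether" and "fdx" fields in the "media changed" messages can be derived
from the raw media word by hand. The sketch below does this in user space. It
assumes the bit layout from net/if_media.h (IFM_NMASK, IFM_ETHER, IFM_FDX);
the values are copied here so the snippet builds standalone and should be
checked against the header of the release in use. media_is_ether and
media_is_fdx are hypothetical helper names, not kernel functions.

```c
/* Assumed to match net/if_media.h; verify against your source tree. */
#define IFM_NMASK	0x000000e0	/* network type mask */
#define IFM_ETHER	0x00000020	/* Ethernet network type */
#define IFM_FDX		0x00100000	/* full-duplex option bit */

/* Hypothetical helper: 1 if the media word describes Ethernet. */
static int
media_is_ether(int media)
{
	return ((media & IFM_NMASK) == IFM_ETHER);
}

/* Hypothetical helper: 1 if the full-duplex bit is set. */
static int
media_is_fdx(int media)
{
	return ((media & IFM_FDX) != 0);
}
```

For the value 0x22 seen here, this gives ether = 1 and fdx = 0, matching the
debug output.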

Ports are added, but they are missing Full Duplex for some reason, so they
don't participate in lagg yet. This gets fixed later on, but not for bce3:

Jul 10 17:18:57 aw19lb2 kernel: bce3: state change to UP; from 0x0 to 0x1, calling if_link_state_change(0x2, 0x2)
Jul 10 17:18:57 aw19lb2 kernel: bce3: link state changed to UP, scheduling do_link_state_change
Jul 10 17:18:57 aw19lb2 kernel: bce3: not really changed, skipping

bce3 performs a change from down to up, but there was never a change to down,
so if_link_state_change ignores the change and the lagg hook is never called.
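
The sequence can be replayed with a small user-space model. This is only a
sketch under stated assumptions, not the kernel code: struct ifnet_model and
model_link_state_change are hypothetical stand-ins that keep just the
same-state check from if_link_state_change and the "if (ifp->if_lagg)" test
from do_link_state_change, and they collapse the deferred task into a direct
call. The LINK_STATE_* values are copied from sys/net/if.h.

```c
/* Copied from FreeBSD's sys/net/if.h for use outside the kernel tree. */
#define LINK_STATE_UNKNOWN	0
#define LINK_STATE_DOWN		1
#define LINK_STATE_UP		2

/* Hypothetical, heavily reduced stand-in for struct ifnet. */
struct ifnet_model {
	int link_state;		/* mirrors ifp->if_link_state */
	int in_lagg;		/* stands in for ifp->if_lagg != NULL */
	int lagg_notified;	/* times the lagg hook would have run */
};

/*
 * Model of if_link_state_change + do_link_state_change.
 * Returns 1 if the change was propagated, 0 if it was dropped.
 */
static int
model_link_state_change(struct ifnet_model *ifp, int link_state)
{
	/* Return if state hasn't changed. */
	if (ifp->link_state == link_state)
		return (0);

	ifp->link_state = link_state;
	if (ifp->in_lagg)
		ifp->lagg_notified++;
	return (1);
}

/* bce3 as logged: UP before joining the lagg, then UP again. */
static int
replay_bce3(void)
{
	struct ifnet_model bce3 = { LINK_STATE_UNKNOWN, 0, 0 };

	model_link_state_change(&bce3, LINK_STATE_UP);	/* rc.conf "up" */
	bce3.in_lagg = 1;				/* laggport bce3 */
	model_link_state_change(&bce3, LINK_STATE_UP);	/* 17:18:57 */
	return (bce3.lagg_notified);
}

/* Same start, but with a DOWN reported before the second UP. */
static int
replay_with_down(void)
{
	struct ifnet_model ifp = { LINK_STATE_UNKNOWN, 0, 0 };

	model_link_state_change(&ifp, LINK_STATE_UP);
	ifp.in_lagg = 1;
	model_link_state_change(&ifp, LINK_STATE_DOWN);
	model_link_state_change(&ifp, LINK_STATE_UP);
	return (ifp.lagg_notified);
}
```

In this model replay_bce3() ends with zero lagg notifications, because the
only real transition happened before the port joined the lagg, while
replay_with_down() ends with two.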

For bce2 this looks different. There is a real change to down, and then to up:

Jul 10 17:18:59 aw19lb2 kernel: bce2: state change to DOWN; from 0x1 to 0x0, calling if_link_state_change(0x2, 0x1)
Jul 10 17:18:59 aw19lb2 kernel: bce2: link state changed to DOWN, scheduling do_link_state_change
Jul 10 17:18:59 aw19lb2 kernel: bce2: link state changed to DOWN, calling hooks
Jul 10 17:18:59 aw19lb2 kernel: bce2: hook link_state
Jul 10 17:18:59 aw19lb2 kernel: bce2: media changed 0x22 -> 0x100630, ether = 1, fdx = 1, link = 0
Jul 10 17:18:59 aw19lb2 kernel: bce2: LACP_STATE_AGGREGATION==0, key is 8003
Jul 10 17:18:59 aw19lb2 kernel: bce2: link state changed to DOWN
# end of do_link_state_change
Jul 10 17:18:59 aw19lb2 kernel: bce2: state change to UP; from 0x0 to 0x1, calling if_link_state_change(0x1, 0x2)
Jul 10 17:18:59 aw19lb2 kernel: bce2: link state changed to UP, scheduling do_link_state_change
Jul 10 17:18:59 aw19lb2 kernel: bce2: link state changed to UP, calling hooks
Jul 10 17:18:59 aw19lb2 kernel: bce2: hook link_state
Jul 10 17:18:59 aw19lb2 kernel: bce2: media changed 0x100630 -> 0x100630, ether = 1, fdx = 1, link = 1
Jul 10 17:18:59 aw19lb2 kernel: bce2: key is 20b
Jul 10 17:18:59 aw19lb2 kernel: bce2: -> UNSELECTED
Jul 10 17:18:59 aw19lb2 kernel: bce2: link state changed to UP
Jul 10 17:18:59 aw19lb2 kernel: bce2: lacpdu receive
Jul 10 17:18:59 aw19lb2 kernel: bce2: lacp_sm_rx_update_ntt: assert ntt
Jul 10 17:18:59 aw19lb2 kernel: bce2: old pstate 38<SYNC,COLLECTING,DISTRIBUTING>
Jul 10 17:18:59 aw19lb2 kernel: bce2: new pstate c5<ACTIVITY,AGGREGATION,DEFAULTED,EXPIRED>
Jul 10 17:18:59 aw19lb2 kernel: bce2: lacpdu transmit

In the meantime there are multiple "bce[23]: lacpdu receive" messages, so both
links are really up and are receiving packets from the switch.

My system is patched with the patch from
http://lists.freebsd.org/pipermail/freebsd-net/2013-February/034649.html;
otherwise there are even more problems, as bce[01] are shown to be connected to
different media, even though both of them are connected directly to the switch
via the BladeCenter's backplane. In general, such strict checking of the media
type prevents creating a 4-port lagg in a network configuration like mine, where
some ports would use the direct backplane connection and some extra copper
cables, but this is yet another topic...

[root at aw19lb2 ~]# ifconfig -m bce0
bce0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=800bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,LINKSTATE>
        capabilities=800bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,LINKSTATE>
        ether c8:1f:66:9d:1b:88
        media: Ethernet autoselect (1000baseSX <full-duplex,rxpause,txpause>)
        status: active
        supported media:
                media autoselect
                media 1000baseSX mediaopt full-duplex
                media 1000baseSX
[root at aw19lb2 ~]# 
[root at aw19lb2 ~]# ifconfig -m bce1
bce1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=800bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,LINKSTATE>
        capabilities=800bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,LINKSTATE>
        ether c8:1f:66:9d:1b:88
        media: Ethernet autoselect (1000baseT <full-duplex,rxpause,txpause>)
        status: active
        supported media:
                media autoselect
                media 1000baseSX mediaopt full-duplex
                media 1000baseSX


