lacp lagg port flags do not show correctly resulting in poor traffic distribution/performance

Jason Hellenthal jhellenthal at dataix.net
Tue Jul 10 07:10:16 UTC 2012



On Mon, Jul 09, 2012 at 05:38:24PM -0700, Adarsh Joshi wrote:
> Hi,
> 
> I am trying to configure lacp lagg interfaces with 2 systems connected back to back as follows:
> 
> ifconfig lagg0 create
> ifconfig lagg0 laggproto lacp laggport ql0 laggport ql1 192.168.100.1 netmask 255.255.255.0
> 
> Sometimes the lagg interface comes up correctly, but sometimes the laggport flags do not show properly: instead of 1c<ACTIVE,COLLECTING,DISTRIBUTING>, a port shows 18<COLLECTING,DISTRIBUTING>. I have seen similar issues reported on various forums with no solution.
> Looking at the lagg driver code and reading the standard, I thought the laggport flags (defined in if_lagg.h) were based on the LACP_STATE_BITS in ieee8023ad_lacp.h, but the following ifconfig -v output does not make sense to me.
> 
> My concern is that when all the interfaces show flags of 1c, traffic is distributed uniformly across both interfaces and I get aggregated throughput. Otherwise, traffic flows over only one interface.
> 
> Is this a bug? How do I solve this? Or am I doing something wrong?
> 
> I am using FreeBSD 9.0-RELEASE.
> 
> System 1:
> # ifconfig -v lagg0
> lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=13b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,TSO4>
>         ether 00:0e:1e:08:05:20
>         inet 192.168.100.1 netmask 0xffffff00 broadcast 192.168.100.255
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: Ethernet autoselect
>         status: active
>         groups: lagg
>         laggproto lacp
>         lag id: [(8000,00-0E-1E-08-05-20,0213,0000,0000),
>                  (8000,00-0E-1E-04-2C-F0,0213,0000,0000)]
>         laggport: ql1 flags=18<COLLECTING,DISTRIBUTING> state=7D
>                 [(8000,00-0E-1E-08-05-20,0213,8000,000F),
>                  (FFFF,00-00-00-00-00-00,0000,FFFF,0000)]
>         laggport: ql0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3D
>                 [(8000,00-0E-1E-08-05-20,0213,8000,000E),
>                  (8000,00-0E-1E-04-2C-F0,0213,8000,000E)]
> 
> System 2:
> 
> # ifconfig -v lagg0
> lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=13b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,TSO4>
>         ether 00:0e:1e:04:2c:f0
>         inet 192.168.100.2 netmask 0xffffff00 broadcast 192.168.100.255
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: Ethernet autoselect
>         status: active
>         groups: lagg
>         laggproto lacp
>         lag id: [(8000,00-0E-1E-04-2C-F0,0213,0000,0000),
>                  (FFFF,00-00-00-00-00-00,0000,0000,0000)]
>         laggport: ql1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=7D
>                 [(8000,00-0E-1E-04-2C-F0,0213,8000,000F),
>                  (FFFF,00-00-00-00-00-00,0000,FFFF,0000)]
>         laggport: ql0 flags=18<COLLECTING,DISTRIBUTING> state=3D
>                 [(8000,00-0E-1E-04-2C-F0,0213,8000,000E),
>                  (8000,00-0E-1E-08-05-20,0213,8000,000E)]
> 
> 

Just for reference ... (stable/8 @ r238264)

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80048<VLAN_MTU,POLLING,LINKSTATE>
        ether 00:0c:41:21:1d:b5
        inet 192.168.XX.X netmask 0xffffff00 broadcast 192.168.XX.XXX
        media: Ethernet autoselect
        status: active
        groups: lagg 
        laggproto lacp lagghash l2,l3,l4
        lag id: [(8000,00-0C-41-21-1D-B5,00E6,0000,0000),
                 (FFFF,00-00-00-00-00-00,0000,0000,0000)]
        laggport: dc1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=7D
                [(8000,00-0C-41-21-1D-B5,00E6,8000,0002),
                 (FFFF,00-00-00-00-00-00,0000,FFFF,0000)]
        laggport: dc0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=7D
                [(8000,00-0C-41-21-1D-B5,00E6,8000,0001),
                 (FFFF,00-00-00-00-00-00,0000,FFFF,0000)]


Both ports have shown flags=1c and state=7D for quite some time.
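
For what it's worth, the hex values in the laggport lines can be decoded
by hand. Below is a minimal sketch in Python, assuming the bit
assignments match LAGG_PORT_* in if_lagg.h and LACP_STATE_* in
ieee8023ad_lacp.h (check them against your own source tree). The helper
names decode() and parse_laggports() are made up for this example. It
shows that state=7D differs from state=3D only by the DEFAULTED bit,
which in the standard means the port is running on default partner
information.

```python
import re

# Bit names assumed from if_lagg.h (LAGG_PORT_*) and
# ieee8023ad_lacp.h (LACP_STATE_*); verify against your source tree.
LAGG_PORT_BITS = {
    0x01: "MASTER",
    0x02: "STACK",
    0x04: "ACTIVE",
    0x08: "COLLECTING",
    0x10: "DISTRIBUTING",
    0x20: "DISABLED",
}
LACP_STATE_BITS = {
    0x01: "ACTIVITY",
    0x02: "TIMEOUT",
    0x04: "AGGREGATION",
    0x08: "SYNC",
    0x10: "COLLECTING",
    0x20: "DISTRIBUTING",
    0x40: "DEFAULTED",
    0x80: "EXPIRED",
}


def decode(value, names):
    """Return the names of the bits set in value, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if value & bit]


# Matches lines such as:
#   laggport: ql1 flags=18<COLLECTING,DISTRIBUTING> state=7D
LAGGPORT_RE = re.compile(
    r"laggport:\s+(\S+)\s+flags=([0-9a-fA-F]+)<[^>]*>\s+state=([0-9a-fA-F]+)"
)


def parse_laggports(text):
    """Yield (port, flag names, state names) for each laggport line."""
    for m in LAGGPORT_RE.finditer(text):
        port = m.group(1)
        flags = int(m.group(2), 16)
        state = int(m.group(3), 16)
        yield port, decode(flags, LAGG_PORT_BITS), decode(state, LACP_STATE_BITS)


# Two laggport lines copied from the "System 1" output quoted above.
sample = """
        laggport: ql1 flags=18<COLLECTING,DISTRIBUTING> state=7D
        laggport: ql0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3D
"""

for port, flags, state in parse_laggports(sample):
    print(port, "flags:", ",".join(flags), "state:", ",".join(state))
```

Feeding it the full ifconfig -v lagg0 output from each box makes it easy
to see which ports lack ACTIVE and which are running with DEFAULTED set.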

And just to show the variation ...


 dc0:
     yesterday      8.53 MiB  /    2.61 MiB  /   11.14 MiB
         today       693 KiB  /     156 KiB  /     849 KiB

 dc1:
     yesterday     19.00 MiB  /    1.79 MiB  /   20.78 MiB
         today       496 KiB  /     103 KiB  /     599 KiB

 lagg0:
     yesterday     27.53 MiB  /    3.71 MiB  /   31.24 MiB
         today      1.16 MiB  /     172 KiB  /    1.33 MiB


I know there have been some recent changes to the lagg code in
stable/9 and stable/8, so you may want to look into those.

Given that this is lagg with LACP, you will see some variation
regardless, but I recall a point where one interface seemed to be
favored over the other so consistently that I second-guessed whether
it was doing the right thing.

LACP on Cisco behaves quite differently from how we treat it here in
FreeBSD: it tends to use the interfaces quite evenly all the time (in
both PAgP and LACP modes), which also has me second-guessing whether
the right thing is happening here.

These sysctl tunables may also point you in the direction you want to
look:

net.link.lagg.default_use_flowid: 1
net.link.lagg.failover_rx_all: 0
net.link.lagg.0.use_flowid: 1
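
As a rough picture of why a single connection stays on one port while
many connections spread out, here is a small Python sketch of per-flow
port selection. It is only an illustration: select_tx_port() is a
made-up name, CRC32 stands in for whatever hash the lagg driver really
computes for lagghash l2,l3,l4, and the flowid argument models my
understanding that with use_flowid set to 1 a flow ID supplied by the
NIC is used in place of that hash.

```python
import zlib
from collections import Counter


def select_tx_port(ports, src_ip, dst_ip, src_port, dst_port, flowid=None):
    """Pick one egress port for a flow.

    If flowid is given (modelling a NIC-supplied flow ID), use it
    directly; otherwise hash the flow's addresses and ports. CRC32 is a
    stand-in here, not the hash the lagg driver uses.
    """
    if not ports:
        raise ValueError("no distributing ports")
    if flowid is None:
        key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
        flowid = zlib.crc32(key)
    # Every packet of the same flow yields the same index.
    return ports[flowid % len(ports)]


ports = ["dc0", "dc1"]

# The same flow is mapped to the same port every time.
first = select_tx_port(ports, "192.168.100.1", "192.168.100.2", 50000, 80)
again = select_tx_port(ports, "192.168.100.1", "192.168.100.2", 50000, 80)
print("same port for same flow:", first == again)

# Many flows (different source ports) are spread across the ports.
counts = Counter(
    select_tx_port(ports, "192.168.100.1", "192.168.100.2", sport, 80)
    for sport in range(50000, 50100)
)
print(counts)
```

With a scheme like this, a single-stream throughput test can never use
more than one link, so uneven per-port counters do not by themselves
prove that the aggregation is broken.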

Setting -rxcsum and -txcsum on lo0, ql0 and ql1 might help too, though
I only turn them off on lo0.


Good luck

-- 

 - (2^(N-1))