lacp lagg port flags do not show correctly resulting in poor traffic distribution/performance

Adarsh Joshi adarsh.joshi at qlogic.com
Tue Jul 10 17:15:41 UTC 2012


Jason,

Thanks for the reply.

Why do you say "Given this is LAgg and LACP you will see some variation regardless"?

Can you please shed some more light on this?
I would understand a slight variation in throughput when LAGG and LACP are configured, but in my case I do not see a single packet on one of the interfaces except for LACPDUs and other LACP control packets.

Thanks
Adarsh

-----Original Message-----
From: Jason Hellenthal [mailto:jhellenthal at dataix.net]
Sent: Tuesday, July 10, 2012 12:10 AM
To: Adarsh Joshi
Cc: freebsd-net at freebsd.org
Subject: Re: lacp lagg port flags do not show correctly resulting in poor traffic distribution/performance



On Mon, Jul 09, 2012 at 05:38:24PM -0700, Adarsh Joshi wrote:
> Hi,
>
> I am trying to configure lacp lagg interfaces with 2 systems connected back to back as follows:
>
> ifconfig lagg0 create
> ifconfig lagg0 laggproto lacp laggport ql0 laggport ql1 192.168.100.1 netmask 255.255.255.0
>
> Sometimes the lagg interface comes up correctly, but sometimes the laggport flags do not show properly. Instead of 1c<ACTIVE,COLLECTING,DISTRIBUTING>, they show a value of 18. I have seen similar issues reported on various forums with no solution.
> Looking at the lagg driver code and reading the standard, I thought the laggport flags (defined in if_lagg.h) were based on the LACP_STATE_BITS in ieee8023ad_lacp.h. But the following ifconfig -v output does not make any sense to me.
>
> My concern is that when all the interfaces show flags as 1c, the traffic is distributed across both the interfaces uniformly and I get aggregated throughput. If not, the traffic flows only on 1 interface.
>
> Is this a bug? How do I solve this? Or am I doing something wrong?
>
> I am using FreeBSD 9.0-RELEASE.
>
> System 1:
> # ifconfig -v lagg0
> lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=13b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,TSO4>
>         ether 00:0e:1e:08:05:20
>         inet 192.168.100.1 netmask 0xffffff00 broadcast 192.168.100.255
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: Ethernet autoselect
>         status: active
>         groups: lagg
>         laggproto lacp
>         lag id: [(8000,00-0E-1E-08-05-20,0213,0000,0000),
>                  (8000,00-0E-1E-04-2C-F0,0213,0000,0000)]
>         laggport: ql1 flags=18<COLLECTING,DISTRIBUTING> state=7D
>                 [(8000,00-0E-1E-08-05-20,0213,8000,000F),
>                  (FFFF,00-00-00-00-00-00,0000,FFFF,0000)]
>         laggport: ql0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=3D
>                 [(8000,00-0E-1E-08-05-20,0213,8000,000E),
>                  (8000,00-0E-1E-04-2C-F0,0213,8000,000E)]
>
> System 2:
>
> # ifconfig -v lagg0
> lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=13b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,TSO4>
>         ether 00:0e:1e:04:2c:f0
>         inet 192.168.100.2 netmask 0xffffff00 broadcast 192.168.100.255
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: Ethernet autoselect
>         status: active
>         groups: lagg
>         laggproto lacp
>         lag id: [(8000,00-0E-1E-04-2C-F0,0213,0000,0000),
>                  (FFFF,00-00-00-00-00-00,0000,0000,0000)]
>         laggport: ql1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=7D
>                [(8000,00-0E-1E-04-2C-F0,0213,8000,000F),
>                  (FFFF,00-00-00-00-00-00,0000,FFFF,0000)]
>         laggport: ql0 flags=18<COLLECTING,DISTRIBUTING> state=3D
>                 [(8000,00-0E-1E-04-2C-F0,0213,8000,000E),
>                  (8000,00-0E-1E-08-05-20,0213,8000,000E)]
>
>

Just for reference ... (stable/8 @ r238264)

lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80048<VLAN_MTU,POLLING,LINKSTATE>
        ether 00:0c:41:21:1d:b5
        inet 192.168.XX.X netmask 0xffffff00 broadcast 192.168.XX.XXX
        media: Ethernet autoselect
        status: active
        groups: lagg
        laggproto lacp lagghash l2,l3,l4
        lag id: [(8000,00-0C-41-21-1D-B5,00E6,0000,0000),
                 (FFFF,00-00-00-00-00-00,0000,0000,0000)]
        laggport: dc1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=7D
                [(8000,00-0C-41-21-1D-B5,00E6,8000,0002),
                 (FFFF,00-00-00-00-00-00,0000,FFFF,0000)]
        laggport: dc0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING> state=7D
                [(8000,00-0C-41-21-1D-B5,00E6,8000,0001),
                 (FFFF,00-00-00-00-00-00,0000,FFFF,0000)]


Both ports have shown flags=1c and state=7D for quite some time.
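
For anyone decoding the flags= and state= values in the ifconfig -v output by hand, here is a minimal sketch in Python. The bit names are as I recall them from if_lagg.h (LAGG_PORT_*) and ieee8023ad_lacp.h (LACP_STATE_*), so treat the tables as an assumption and check them against your own source tree. The decode() helper is just an illustration, not anything from the driver.

```python
# Bit tables recalled from if_lagg.h and ieee8023ad_lacp.h (assumption:
# verify against the headers in your tree before relying on them).
LAGG_PORT_FLAGS = {
    0x01: "MASTER",
    0x02: "STACK",
    0x04: "ACTIVE",
    0x08: "COLLECTING",
    0x10: "DISTRIBUTING",
    0x20: "DISABLED",
}

LACP_STATE_BITS = {
    0x01: "ACTIVITY",
    0x02: "TIMEOUT",
    0x04: "AGGREGATION",
    0x08: "SYNC",
    0x10: "COLLECTING",
    0x20: "DISTRIBUTING",
    0x40: "DEFAULTED",
    0x80: "EXPIRED",
}


def decode(value, table):
    """Return the names of all bits set in value, lowest bit first."""
    return [name for bit, name in sorted(table.items()) if value & bit]


if __name__ == "__main__":
    for flags in (0x1C, 0x18):
        print(f"flags={flags:x}: {decode(flags, LAGG_PORT_FLAGS)}")
    for state in (0x7D, 0x3D):
        print(f"state={state:X}: {decode(state, LACP_STATE_BITS)}")
```

Under those tables, 18 is 1c without ACTIVE, and 7D is 3D plus DEFAULTED. In 802.3ad terms DEFAULTED means the port is running on default partner information rather than on information from received LACPDUs, which fits the all-FFFF partner tuple shown for those ports.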

And just to show the variation ...


 dc0:
     yesterday      8.53 MiB  /    2.61 MiB  /   11.14 MiB
         today       693 KiB  /     156 KiB  /     849 KiB

 dc1:
     yesterday     19.00 MiB  /    1.79 MiB  /   20.78 MiB
         today       496 KiB  /     103 KiB  /     599 KiB

 lagg0:
     yesterday     27.53 MiB  /    3.71 MiB  /   31.24 MiB
         today      1.16 MiB  /     172 KiB  /    1.33 MiB
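
To put a number on that variation, here is a small sketch that turns the per-port figures into shares. It assumes the first column is received traffic (rx / tx / total layout), which the output itself does not state. The shares() helper is only for illustration.

```python
def shares(per_port):
    """Return each port's fraction of the combined total."""
    total = sum(per_port.values())
    return {port: value / total for port, value in per_port.items()}


if __name__ == "__main__":
    # First column of the figures above, assumed to be received traffic.
    yesterday_mib = {"dc0": 8.53, "dc1": 19.00}
    today_kib = {"dc0": 693, "dc1": 496}
    for label, data in (("yesterday", yesterday_mib), ("today", today_kib)):
        for port, frac in shares(data).items():
            print(f"{label} {port}: {frac:.1%}")
```

That works out to roughly 31/69 for dc0/dc1 yesterday and roughly 58/42 today, so the split is uneven and it moves around, but both ports are carrying traffic.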


I believe (know) there have been some changes in the LAgg code in
stable/9 and stable/8 recently, so you may want to check into that.

Given this is LAgg and LACP, you will see some variation regardless. That said, I recall a point where one interface seemed to be favored over the other so consistently that I began second-guessing whether it was doing the right thing.
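
A toy sketch of why some variation is expected with per-flow hashing (lagghash l2,l3,l4 in the output above): every packet of a flow goes out the same port, so the split depends on how many distinct flows there are and where they happen to land. This is not the lagg driver's actual hash; CRC32 is only a stand-in, and pick_port(), spread() and the port numbers are made up for illustration.

```python
import zlib


def pick_port(flow, ports):
    """Map a flow tuple to one port; the same flow always gets the same port."""
    key = "|".join(str(field) for field in flow).encode()
    return ports[zlib.crc32(key) % len(ports)]


def spread(flows, ports):
    """Count how many flows land on each port."""
    counts = {port: 0 for port in ports}
    for flow in flows:
        counts[pick_port(flow, ports)] += 1
    return counts


if __name__ == "__main__":
    ports = ["ql0", "ql1"]
    # One stream between the two back-to-back hosts is a single flow.
    single = [("192.168.100.1", "192.168.100.2", 50000, 5001)]
    # Many streams that differ only in their (hypothetical) source port.
    many = [("192.168.100.1", "192.168.100.2", 50000 + i, 5001) for i in range(64)]
    print(spread(single, ports))
    print(spread(many, ports))
```

A single stream can only ever occupy one port no matter how good the hash is, so a throughput test between two hosts needs several concurrent streams before an even split is even possible. That explains variation, though, not a port that carries nothing but LACPDUs.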

LACP on Cisco behaves quite differently from how we treat it here in FreeBSD: it tends to use the interfaces quite evenly all the time (in both PAgP and LACP modes), which also has me second-guessing whether the right thing is happening here.

These tunables (sysctls) may also point you in the direction you want to look.

net.link.lagg.default_use_flowid: 1
net.link.lagg.failover_rx_all: 0
net.link.lagg.0.use_flowid: 1

Setting -rxcsum and -txcsum on lo0, ql0 and ql1 might be of benefit too, though I only turn them off on lo0.


Good luck

--

 - (2^(N-1))
