[Bug 234242] LACP l2,l3,l4 load sharing only respects dst port

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Fri Dec 21 10:20:14 UTC 2018


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234242

            Bug ID: 234242
           Summary: LACP l2,l3,l4 load sharing only respects dst port
           Product: Base System
           Version: 11.2-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs at FreeBSD.org
          Reporter: m.muenz at gmail.com

Hi,

I'm chasing a performance problem when using FreeBSD as a router.
The system has 4 Mellanox ConnectX-3 cards, bonded into 2 LAGGs with LACP.

mlxen0 and mlxen1 form lagg0; mlxen2 and mlxen3 form lagg1.

Linux boxes, also with ConnectX-3 cards, are attached directly to these
interfaces.

This is the ifconfig output from the router:


mlxen0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8d00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether 24:8a:07:f7:5b:30
        hwaddr 24:8a:07:f7:5b:30
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-CX4 <full-duplex,rxpause,txpause>)
        status: active
mlxen1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8d00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether 24:8a:07:f7:5b:30
        hwaddr 24:8a:07:f7:5b:31
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-CX4 <full-duplex,rxpause,txpause>)
        status: active
mlxen2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8d00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether 24:8a:07:f7:5f:10
        hwaddr 24:8a:07:f7:5f:10
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-CX4 <full-duplex,rxpause,txpause>)
        status: active
mlxen3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8d00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether 24:8a:07:f7:5f:10
        hwaddr 24:8a:07:f7:5f:11
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-CX4 <full-duplex,rxpause,txpause>)
        status: active
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8d00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether 24:8a:07:f7:5b:30
        inet6 fe80::268a:7ff:fef7:5b30%lagg0 prefixlen 64 scopeid 0xb
        inet 10.22.1.1 netmask 0xffffff00 broadcast 10.22.1.255
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: active
        groups: lagg
        laggproto lacp lagghash l2,l3,l4
        laggport: mlxen0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: mlxen1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
lagg1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8d00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
        ether 24:8a:07:f7:5f:10
        inet6 fe80::268a:7ff:fef7:5f10%lagg1 prefixlen 64 scopeid 0xc
        inet 10.22.2.1 netmask 0xffffff00 broadcast 10.22.2.255
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: active
        groups: lagg
        laggproto lacp lagghash l2,l3,l4
        laggport: mlxen2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: mlxen3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>


When I connect the 2 Linux boxes directly to each other, I can easily achieve
20 Gbit/s.

iperf server: iperf3 -V -p 5000 -f m -s
iperf client: iperf3 -p 5000 -f m -V -c 10.22.2.10 -t 30 -P 10

With the FreeBSD router in between, I can only achieve 10 Gbit/s.

The ifconfig man page states:

     lagghash option[,option]
             Set the packet layers to hash for aggregation protocols which
             load balance.  The default is "l2,l3,l4".  The options can be
             combined using commas.

             l2      src/dst mac address and optional vlan number.
             l3      src/dst address for IPv4 or IPv6.
             l4      src/dst port for TCP/UDP/SCTP.

The problem is that the l4 description does not seem to hold in practice: iperf
in multistream mode (-P 10) uses multiple source ports, yet the load arrives on
lagg0 split as 5 Gbit/s each on mlxen0 and mlxen1 and leaves via lagg1 through
only one of its interfaces at 10 Gbit/s.
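
To illustrate the expectation, here is a minimal Python sketch of how a flow
hash could pick a LAGG member port. It is a toy XOR hash, not the hash that
lagg(4) actually uses. The client address 10.22.1.10, the source ports
50000-50009 and the name pick_link are assumptions made up for this example.
It only shows why ten streams that differ in source port should spread over
two links when both ports are hashed, and why they all land on one link when
only the destination port is hashed.

```python
import ipaddress
from collections import Counter

def pick_link(n_links, src_ip, dst_ip, src_port, dst_port, hash_src_port=True):
    """Toy l3+l4 flow hash (illustration only, not the FreeBSD algorithm)."""
    # l3: fold source and destination addresses together
    h = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    # l4: always include the destination port
    h ^= dst_port
    # l4: optionally include the source port as well
    if hash_src_port:
        h ^= src_port
    return h % n_links

# Ten parallel streams to one server port, as with "iperf3 -P 10".
# The client address and source ports are hypothetical.
flows = [("10.22.1.10", "10.22.2.10", sport, 5000)
         for sport in range(50000, 50010)]

# Flows per link when src and dst ports are both hashed
spread_full = Counter(pick_link(2, *f) for f in flows)
# Flows per link when only the dst port is hashed
spread_dst_only = Counter(pick_link(2, *f, hash_src_port=False) for f in flows)

print("src+dst port:", dict(spread_full))
print("dst port only:", dict(spread_dst_only))
```

With this toy hash the ten flows split evenly over both links in the first
case and all map to the same link in the second, which matches what I see
leaving lagg1.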

If I start a second iperf instance listening on a different port and run both
at the same time, the traffic flows at 20 Gbit/s and is shared correctly.


Searching the bug tracker and forums and asking on IRC didn't turn up a good
answer. I already tried LACP strict mode and enabling/disabling flowid;
neither helped.

Right now I'm not sure whether this is a bug in the kernel or in the
documentation, but it would be good if both src and dst ports were included in
the hash calculation.


Thanks
Michael
