[Bug 234242] LACP l2,l3,l4 load sharing only respects dst port
bugzilla-noreply at freebsd.org
Fri Dec 21 10:20:14 UTC 2018
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234242
Bug ID: 234242
Summary: LACP l2,l3,l4 load sharing only respects dst port
Product: Base System
Version: 11.2-STABLE
Hardware: amd64
OS: Any
Status: New
Severity: Affects Many People
Priority: ---
Component: kern
Assignee: bugs at FreeBSD.org
Reporter: m.muenz at gmail.com
Hi,
I'm chasing a performance problem when using FreeBSD as a router.
The system has 4 Mellanox ConnectX-3 cards, bonded into 2 LAGGs with LACP:
mlxen0 and mlxen1 form lagg0, and mlxen2 and mlxen3 form lagg1.
Linux boxes, also with ConnectX-3 cards, are connected directly to these
interfaces.
This is the ifconfig from the router:
mlxen0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=8d00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
ether 24:8a:07:f7:5b:30
hwaddr 24:8a:07:f7:5b:30
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (10Gbase-CX4 <full-duplex,rxpause,txpause>)
status: active
mlxen1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=8d00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
ether 24:8a:07:f7:5b:30
hwaddr 24:8a:07:f7:5b:31
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (10Gbase-CX4 <full-duplex,rxpause,txpause>)
status: active
mlxen2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=8d00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
ether 24:8a:07:f7:5f:10
hwaddr 24:8a:07:f7:5f:10
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (10Gbase-CX4 <full-duplex,rxpause,txpause>)
status: active
mlxen3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=8d00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
ether 24:8a:07:f7:5f:10
hwaddr 24:8a:07:f7:5f:11
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect (10Gbase-CX4 <full-duplex,rxpause,txpause>)
status: active
lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=8d00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
ether 24:8a:07:f7:5b:30
inet6 fe80::268a:7ff:fef7:5b30%lagg0 prefixlen 64 scopeid 0xb
inet 10.22.1.1 netmask 0xffffff00 broadcast 10.22.1.255
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: active
groups: lagg
laggproto lacp lagghash l2,l3,l4
laggport: mlxen0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
laggport: mlxen1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
lagg1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=8d00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE>
ether 24:8a:07:f7:5f:10
inet6 fe80::268a:7ff:fef7:5f10%lagg1 prefixlen 64 scopeid 0xc
inet 10.22.2.1 netmask 0xffffff00 broadcast 10.22.2.255
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
media: Ethernet autoselect
status: active
groups: lagg
laggproto lacp lagghash l2,l3,l4
laggport: mlxen2 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
laggport: mlxen3 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
When I connect the 2 Linux boxes directly to each other I can easily achieve
20 Gbit/s.
iperf server: iperf3 -V -p 5000 -f m -s
iperf client: iperf3 -p 5000 -f m -V -c 10.22.2.10 -t 30 -P 10
With the BSD router in between I can only achieve 10 Gbit/s.
The ifconfig man page states:
lagghash option[,option]
Set the packet layers to hash for aggregation protocols which
load balance. The default is "l2,l3,l4". The options can be
combined using commas.
l2 src/dst mac address and optional vlan number.
l3 src/dst address for IPv4 or IPv6.
l4 src/dst port for TCP/UDP/SCTP.
The problem is that the l4 description does not seem to hold in practice:
iperf in multistream mode (-P 10) uses multiple source ports, yet the load
arrives on lagg0 split at 5 Gbit/s each across mlxen0 and mlxen1 and leaves
via lagg1 through only one of its interfaces at 10 Gbit/s.
If I start a second iperf instance listening on a different port and run
both, the traffic is shared correctly and flows at 20 Gbit/s.
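To illustrate the difference I would expect, here is a toy model in Python.
It is only a sketch and not the kernel's code: FNV-1a is used purely as a
stand-in hash, the function names are made up for this example, and the
source ports 50000-50009 and the second listener port 5001 are assumed
values. It compares a hash that sees only the destination port with one that
also sees the source port, for 10 parallel streams on a 2-port lagg.

```python
import struct


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a, used here only as a stand-in hash function."""
    h = 0x811C9DC5
    for b in data:
        h ^= b
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


def pick_link(src_port: int, dst_port: int, n_links: int,
              use_src_port: bool) -> int:
    """Choose an egress link index for one flow (hypothetical helper)."""
    key = struct.pack("!H", dst_port)
    if use_src_port:
        # Hash both L4 ports, as the man page text suggests.
        key = struct.pack("!H", src_port) + key
    return fnv1a_32(key) % n_links


def links_used(flows, n_links: int, use_src_port: bool) -> set:
    """Return the set of link indexes that carry at least one flow."""
    return {pick_link(s, d, n_links, use_src_port) for s, d in flows}


if __name__ == "__main__":
    # 10 streams like "iperf3 -P 10": assumed ephemeral source ports,
    # one destination port (5000, as in the iperf commands above).
    one_server = [(50000 + i, 5000) for i in range(10)]
    # Same streams plus a second listener on an assumed port 5001.
    two_servers = one_server + [(50000 + i, 5001) for i in range(10)]

    print("dst port only, one server :", links_used(one_server, 2, False))
    print("dst port only, two servers:", links_used(two_servers, 2, False))
    print("src+dst port, one server  :", links_used(one_server, 2, True))
```

In this model a destination-port-only hash can never use more than one link
for a single iperf server, which matches what I observe on lagg1, while
adding the source port to the key lets the 10 streams spread over both links.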
Searching the bug tracker and forums and asking on IRC didn't turn up a good
answer. I have already tried LACP strict mode and enabling/disabling flowid;
neither helps.
Right now I'm not sure whether this is a bug in the kernel or in the
documentation, but it would be good if both src and dst ports were included
in the hash calculation.
Thanks
Michael
--
You are receiving this mail because:
You are the assignee for the bug.