lagg/lacp poor traffic distribution
Eugene Grosbein
egrosbein at rdtc.ru
Tue Dec 21 10:39:26 UTC 2010
On 20.12.2010 17:21, Shtorm wrote:
> On Sun, 2010-12-19 at 00:35 +0600, Eugene Grosbein wrote:
>> Hi!
>>
>> I've loaded a router using two lagg interfaces in LACP mode.
>> lagg0 has an IP address and two ports (em0 and em1) and carries untagged frames.
>> lagg1 has no IP address, has two ports (igb0 and igb1), and carries
>> about 1000 dot-q vlans with lots of hosts in each vlan.
>>
>> For lagg1, lagg distributes outgoing traffic over two ports just fine.
>> For lagg0 (untagged ethernet segment with only 2 MAC addresses)
>> less than 0.07% (54Mbit/s max) of traffic goes to em0
>> and over 99.92% goes to em1, that's bad.
>>
>> That's general traffic of several thousands of customers surfing the web,
>> using torrents etc. I've glanced over the lagg/lacp sources in src/sys/net/
>> and found nothing suspicious; it should extract and use srcIP/dstIP for the hash.
>>
>> How do I debug this problem?
>>
>> Eugene Grosbein
>
> I had this problem with the igb driver, and I found that lagg selects the
> outgoing interface based on the packet header flowid field if the M_FLOWID
> flag is set. In the igb driver code, flowid is set as
>
> #if __FreeBSD_version >= 800000
> 			rxr->fmp->m_pkthdr.flowid = que->msix;
> 			rxr->fmp->m_flags |= M_FLOWID;
> #endif
>
> The same thing in em driver with MULTIQUEUE
>
> That does not give enough distinct flows to balance traffic well, so I
> commented out the check in if_lagg.c
>
> lagg_lb_start(struct lagg_softc *sc, struct mbuf *m)
> {
> 	struct lagg_lb *lb = (struct lagg_lb *)sc->sc_psc;
> 	struct lagg_port *lp = NULL;
> 	uint32_t p = 0;
>
> //	if (m->m_flags & M_FLOWID)
> //		p = m->m_pkthdr.flowid;
> //	else
>
> and with this change I have much better load distribution across interfaces.
>
> Hope it helps.
You are perfectly right. By disabling flow usage I've obtained load sharing
close to even (final patch follows). Two questions:
1. Is it a bug or design problem?
2. Will I get problems like packet reordering by permanently disabling
usage of these flows in lagg(4)?
--- if_lagg.c.orig 2010-12-20 22:53:21.000000000 +0600
+++ if_lagg.c 2010-12-21 13:37:20.000000000 +0600
@@ -168,6 +168,11 @@
&lagg_failover_rx_all, 0,
"Accept input from any interface in a failover lagg");
+int lagg_use_flows = 1;
+SYSCTL_INT(_net_link_lagg, OID_AUTO, use_flows, CTLFLAG_RW,
+ &lagg_use_flows, 1,
+ "Use flows for load sharing");
+
static int
lagg_modevent(module_t mod, int type, void *data)
{
@@ -1666,7 +1671,7 @@
struct lagg_port *lp = NULL;
uint32_t p = 0;
- if (m->m_flags & M_FLOWID)
+ if (lagg_use_flows && (m->m_flags & M_FLOWID))
p = m->m_pkthdr.flowid;
else
p = lagg_hashmbuf(m, lb->lb_key);
--- if_lagg.h.orig 2010-12-21 16:34:35.000000000 +0600
+++ if_lagg.h 2010-12-21 16:35:27.000000000 +0600
@@ -242,6 +242,8 @@
int lagg_enqueue(struct ifnet *, struct mbuf *);
uint32_t lagg_hashmbuf(struct mbuf *, uint32_t);
+extern int lagg_use_flows;
+
#endif /* _KERNEL */
#endif /* _NET_LAGG_H */
--- ieee8023ad_lacp.c.orig 2010-12-21 16:36:09.000000000 +0600
+++ ieee8023ad_lacp.c 2010-12-21 16:35:58.000000000 +0600
@@ -812,7 +812,7 @@
return (NULL);
}
- if (m->m_flags & M_FLOWID)
+ if (lagg_use_flows && (m->m_flags & M_FLOWID))
hash = m->m_pkthdr.flowid;
else
hash = lagg_hashmbuf(m, lsc->lsc_hashkey);
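
To illustrate why the flowid path can skew the distribution, here is a small
userland sketch, not kernel code. It assumes the port index is taken as the
selector value modulo the number of ports, and that all traffic arrives on a
single receive queue, so flowid is the same for every packet. toy_hash() and
simulate() are hypothetical names; toy_hash() is only a stand-in for
lagg_hashmbuf() and does not reproduce the real hash.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for lagg_hashmbuf(): mixes the source and
 * destination IPv4 addresses. This is NOT the kernel's hash function.
 */
static uint32_t
toy_hash(uint32_t src, uint32_t dst)
{
	uint32_t h = src ^ dst;

	h ^= h >> 16;
	h *= 0x45d9f3bU;
	h ^= h >> 16;
	return (h);
}

/*
 * Count how many packets land on each port.
 * use_flows != 0: the selector is queue_id for every packet, modelling a
 *                 flowid that only identifies the receive queue (assumption).
 * use_flows == 0: the selector is a hash of the addresses.
 * Assumption: the port is chosen as selector % nports.
 */
static void
simulate(int use_flows, uint32_t queue_id, unsigned nports,
    unsigned npackets, unsigned counts[])
{
	const uint32_t dst = 0xc0a80001U;	/* 192.168.0.1 */
	unsigned i;

	assert(nports > 0);
	for (i = 0; i < nports; i++)
		counts[i] = 0;

	for (i = 0; i < npackets; i++) {
		uint32_t src = 0x0a000000U + i;	/* 10.0.0.0 + i */
		uint32_t p;

		if (use_flows)
			p = queue_id;
		else
			p = toy_hash(src, dst);
		counts[p % nports]++;
	}
}
```

Calling simulate(1, 3, 2, 10000, counts) puts every packet on one port, because
the selector never changes, while simulate(0, 3, 2, 10000, counts) spreads the
same packets over both ports. Real traffic may use more than one queue, so the
actual skew depends on how the NIC assigns packets to queues.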
Eugene Grosbein