lagg Interfaces - don't do Gratuitous ARP?
    Steven Hartland 
    killing at multiplay.co.uk
       
    Thu Sep 22 08:04:27 UTC 2016
    
    
  
On 22/09/2016 00:57, Gleb Smirnoff wrote:
> On Sat, Sep 10, 2016 at 10:51:36PM +1000, Kubilay Kocak wrote:
> K> > <killing at multiplay.co.uk> wrote:
> K> >
> K> >> Yes known issue I'm afraid.
> K> >>
> K> >> I created a patch set to address this but there were objections so
> K> >> it was removed, see the attached which is based on 10.2-RELEASE.
> K> >
> K> > Hi,
> K> >
> K> > Thanks for the reply, and the comprehensive patch. If I get a chance
> K> > I'll see if I can run it up one of the affected boxes, if I can find
> K> > one I can mess around with.
> K> >
> K> > Good to know it wasn't just "me" :)
> K> >
> K> > Cheers,
> K> >
> K> > -Karl
> K>
> K> Also see the following review, which was re-opened (after original
> K> commit was reverted) after said issues were raised, though I can't see
> K> that glebius has commented on it since:
> K>
> K> https://reviews.freebsd.org/D4111
> K>
> K> 11.0 having this would have been awesome. Maybe (hopefully) 11.1
>
> IMHO, the original patch was an absolutely evil hack touching multiple
> layers, for the sake of a very special problem.
>
> I think, that in order to kick forwarding table on switches, lagg
> should:
>
> - allocate an mbuf itself
> - set its source hardware address to its own
> - set destination hardware to broadcast
> - put some payload in there, to make packet of valid size. Why should it be
>    gratuitous ARP? A machine can be running IPv6 only, or may even use whatever
>    higher level protocol, e.g. PPPoE. We shouldn't involve IP into this Layer 2
>    problem at all.
> - Finally, send the prepared mbuf down the lagg member(s).
>
> And please don't hack half of the network stack to achieve that :)
>
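For reference, the frame described in the list above can be sketched in
plain userland C. This is only an illustration of the layout, not code from
either patch: build_kick_frame() is a hypothetical helper, the EtherType
0x88B5 (IEEE 802 local experimental) is an assumption since the thread names
no EtherType or payload, and a real lagg change would fill an mbuf and hand
it to each member port rather than write into a byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KICK_MAC_LEN   6
#define KICK_HDR_LEN   14      /* dst(6) + src(6) + EtherType(2) */
#define KICK_FRAME_LEN 60      /* minimum Ethernet frame, excluding 4-byte FCS */
#define KICK_ETHERTYPE 0x88B5  /* assumption: IEEE 802 local experimental type */

static_assert(KICK_FRAME_LEN > KICK_HDR_LEN, "frame needs room for payload");

/*
 * Hypothetical helper: write a broadcast frame sourced from src_mac into buf.
 * Returns the frame length, or 0 if the arguments are unusable.
 */
size_t
build_kick_frame(uint8_t *buf, size_t buflen, const uint8_t *src_mac)
{
	if (buf == NULL || src_mac == NULL || buflen < KICK_FRAME_LEN)
		return (0);

	/* Destination hardware address: broadcast. */
	memset(buf, 0xff, KICK_MAC_LEN);
	/* Source hardware address: the lagg's own. */
	memcpy(buf + KICK_MAC_LEN, src_mac, KICK_MAC_LEN);
	/* EtherType in network byte order. */
	buf[12] = (uint8_t)(KICK_ETHERTYPE >> 8);
	buf[13] = (uint8_t)(KICK_ETHERTYPE & 0xff);
	/* Zero payload, padding the frame to the minimum valid size. */
	memset(buf + KICK_HDR_LEN, 0, KICK_FRAME_LEN - KICK_HDR_LEN);

	return (KICK_FRAME_LEN);
}
```

A switch that sees this frame arrive on a port learns the source address on
that port, which is the forwarding table update being asked for. Nothing in
the frame depends on IPv4, IPv6 or any other higher level protocol.
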
Yes, it does touch multiple layers, but I'm not sure I'd agree that it was
all that evil; we already have similar parts in the code base, e.g.
lle_event and carp link state.
When I dug around this issue last time, the various papers made it quite
clear what was expected / required to make lagg work properly, which is
what was achieved by the code.
With regard to running IPv6, gratuitous ARP doesn't deal with that either,
hence the nd6 changes included in the patch. Those were the only grey area,
caused by a conflict between the letter of the IPv6 spec and achieving the
requirement of fast recovery.
As a point of reference, we've been running with the changes for nearly a
year now and have never had an issue, so the code is there if people want it.
The disappointing thing about this is that we had a solution, albeit one
not everyone liked, nearly a year ago now, and yet here we are still
stuck with a broken lagg implementation in the tree.
A perfect solution is always nice, but in lieu of that the pragmatist in
me asks: isn't working better than not working at the end of the day?
     Regards
     Steve
    
    