kern/185967: Link Aggregation LAGG: LACP not working in 10.0

Kubilay Kocak koobs.freebsd at gmail.com
Mon Feb 3 10:29:20 UTC 2014


On 3/02/2014 9:20 PM, Ben wrote:
> Hi,
> 
> It was Juniper's active/passive mode regarding LACP.
> 
> It was set to passive and worked as you described without sending any
> packets. Now it was set to active and works perfectly again.
> 
> I couldn't try your patch easily as I didn't have the sources installed
> (and obviously no network connection).
> 
> If time allows I will try your patch anyway.

It would be *great* if you could do that Ben :) Having a successful
real-world test case will provide Scott the confidence to land a commit
and merge it back to stable/10 so that everyone can benefit as soon as
possible.

> Thanks for your help!
> 
> Regards
> Ben
> 
> On 03.02.2014 10:58, Scott Long wrote:
>> Hi,
>>
>> If you can, please test the patch I sent and let me know the results. 
>> I’ll check it into FreeBSD 11 and 10 if it works for you.
>>
>> Thanks,
>> Scott
>>
>> On Feb 3, 2014, at 2:51 AM, Ben <mailinglists at niessen.ch> wrote:
>>
>>> Thank you for your detailed explanation.
>>>
>>> If I understand correctly the switch is probably not set up
>>> correctly, right?
>>>
>>> I will try to have it configured correctly first.
>>>
>>> Thanks a lot for your help!
>>>
>>> Regards
>>> Ben
>>>
>>> On 03.02.2014 10:45, Scott Long wrote:
>>>> Ok, please try the patch I emailed earlier.  Since you’re not seeing
>>>> any receive messages, it means that your switch isn’t generating any
>>>> LACP heartbeats.  The difference between FreeBSD 9.x and 10 is that
>>>> in 9.x, it ran in “optimistic” mode, meaning that it didn’t rely on
>>>> getting receive messages from the switch, and only took a channel
>>>> down if the link state went down.  In strict mode, it looks for the
>>>> receive messages and only transitions to a full operational state if
>>>> it gets them.  So while I know it’s easy to point at the problem
>>>> being FreeBSD 10, seeing as FreeBSD 9 worked for you, please check
>>>> to make sure that your switch is set up correctly.
>>>>
>>>> I authored the original change that went into FreeBSD 10, and I
>>>> tried to make it so that strict_mode=0 would keep everything working
>>>> as it did in 9.  I guess that since you’re getting no receive
>>>> messages from the switch at all that we need to disable strict mode
>>>> on setup, not afterwards.  Apply the patch and everything should
>>>> work as it did in FreeBSD 9.
>>>>
>>>> Scott
>>>>
>>>> On Feb 3, 2014, at 2:29 AM, Ben <mailinglists at niessen.ch> wrote:
>>>>
>>>>> Yes, via sysctl and /etc/sysctl.conf
>>>>>
>>>>> I waited now roughly 20 minutes without touching it but no difference.
>>>>>
>>>>> No, I only see these transmit messages, no receive.
>>>>>
>>>>> Thanks
>>>>> Ben
>>>>>
>>>>> On 03.02.2014 10:25, Scott Long wrote:
>>>>>> Did you set it to 0 via the sysctl?  You might need to wait for
>>>>>> several minutes if you set it after setting up the links.
>>>>>>
>>>>>> Also, the message that you’re seeing is from your machine
>>>>>> transmitting PDU packets.  Are you seeing any "lacpdu receive"
>>>>>> messages on the console?
>>>>>>
>>>>>> Thanks,
>>>>>> Scott
>>>>>>
>>>>>> On Feb 3, 2014, at 2:10 AM, Ben <mailinglists at niessen.ch> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I set strict mode to 0 but no use. I do receive PDU messages.
>>>>>>>
>>>>>>> igb0: lacpdu transmit
>>>>>>> actor=(...)
>>>>>>> actor.state=4d<ACTIVITY,AGGREGATION,SYNC,DEFAULTED>
>>>>>>> partner=(...)
>>>>>>> partner.state=0
>>>>>>> maxdelay=0
>>>>>>>
>>>>>>> Thanks
>>>>>>> Ben
>>>>>>>
>>>>>>> On 03.02.2014 10:03, Scott Long wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Unfortunately, you can’t control the strict mode globally.  My
>>>>>>>> apologies for this mess, I’ll make sure that it’s fixed for
>>>>>>>> FreeBSD 10.1. If the sysctl doesn’t help then maybe consider
>>>>>>>> compiling a custom kernel with it defaulted to 0.  You’ll need
>>>>>>>> to open /sys/net/ieee8023ad_lacp.c and look for the function
>>>>>>>> lacp_attach().  You’ll see the strict_mode assignment underneath
>>>>>>>> that.  I’ll also send you a patch in a few minutes.  Until then,
>>>>>>>> try enabling net.link.lagg.lacp.debug=1 and see if you’re
>>>>>>>> receiving heartbeat PDU’s from your switch.
>>>>>>>>
>>>>>>>> Scott
>>>>>>>>
>>>>>>>> On Feb 3, 2014, at 1:40 AM, Ben <mailinglists at niessen.ch> wrote:
>>>>>>>>
>>>>>>>>> Hi Scott,
>>>>>>>>>
>>>>>>>>> I had tried to set it in /etc/sysctl.conf but it seems it didn't
>>>>>>>>> work. But I will try again and report back.
>>>>>>>>>
>>>>>>>>> The settings of the switch have not been changed and are set to
>>>>>>>>> LACP. It worked before so I guess the switch should not be the
>>>>>>>>> problem. Maybe some incompatibility between FreeBSD +
>>>>>>>>> igb-driver + switch (Juniper EX3300-48T).
>>>>>>>>>
>>>>>>>>> I will update you after setting the sysctl setting. It seems to
>>>>>>>>> be "dynamic", I guess 0 reflects the index of LACP lagg
>>>>>>>>> devices. Can I switch off the strict mode globally in
>>>>>>>>> /etc/sysctl.conf?
>>>>>>>>>
>>>>>>>>> Thanks for your help.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Ben
>>>>>>>>>
>>>>>>>>> On 03.02.2014 09:31, Scott Long wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> You’re probably running into the consequences of r253687. 
>>>>>>>>>> Check to see the value of ‘sysctl
>>>>>>>>>> net.link.lagg.0.lacp.lacp_strict_mode’. If it’s ‘1’ then set
>>>>>>>>>> it to 0.  My original intention was for this to default to 0,
>>>>>>>>>> but apparently that didn’t happen.  However, the fact that
>>>>>>>>>> strict mode doesn’t seem to work at all for you might hint
>>>>>>>>>> that your switch either isn’t configured correctly for LACP,
>>>>>>>>>> or doesn’t actually support LACP at all.  You might want to
>>>>>>>>>> investigate that.
>>>>>>>>>>
>>>>>>>>>> Scott
>>>>>>>>>>
>>>>>>>>>> On Feb 3, 2014, at 1:17 AM, Ben <mailinglists at niessen.ch> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD
>>>>>>>>>>> 9.2 was configured to use LACP with two igb devices.
>>>>>>>>>>>
>>>>>>>>>>> Now it stopped working after the upgrade.
>>>>>>>>>>>
>>>>>>>>>>> This is a screenshot of ifconfig -a after the upgrade to
>>>>>>>>>>> FreeBSD 10.0-RELEASE:
>>>>>>>>>>> http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM
>>>>>>>>>>>
>>>>>>>>>>> A PR is currently open:
>>>>>>>>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967
>>>>>>>>>>>
>>>>>>>>>>> It is set to low, but I would like somebody to have a look
>>>>>>>>>>> into it as it obviously has a great influence on our
>>>>>>>>>>> infrastructure. The only way to "solve" it is currently
>>>>>>>>>>> switching back to FreeBSD 9.2.
>>>>>>>>>>>
>>>>>>>>>>> The suggested fix "use failover" seems not to work.
>>>>>>>>>>>
>>>>>>>>>>> Thank you for your help.
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>> Ben

--
Koobs
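
For anyone reading the "lacpdu transmit" debug output quoted above, the
actor.state and partner.state values are bit fields. Below is a minimal
sketch of a decoder, assuming the standard LACP state octet layout from
IEEE 802.3ad. The function name decode_lacp_state is made up for this
example, and the spellings of the flags that do not appear in the quoted
output (TIMEOUT, COLLECTING, DISTRIBUTING, EXPIRED) are an assumption.

```python
# Bit masks of the LACP actor/partner state octet (IEEE 802.3ad).
# ACTIVITY, AGGREGATION, SYNC and DEFAULTED are spelled as in the debug
# output quoted in this thread; the other labels are assumed spellings.
LACP_STATE_BITS = [
    (0x01, "ACTIVITY"),
    (0x02, "TIMEOUT"),
    (0x04, "AGGREGATION"),
    (0x08, "SYNC"),
    (0x10, "COLLECTING"),
    (0x20, "DISTRIBUTING"),
    (0x40, "DEFAULTED"),
    (0x80, "EXPIRED"),
]


def decode_lacp_state(state):
    """Return the names of the flags set in an LACP state octet."""
    return [name for mask, name in LACP_STATE_BITS if state & mask]


# Values taken from the debug output in this thread.
print(decode_lacp_state(0x4d))  # ['ACTIVITY', 'AGGREGATION', 'SYNC', 'DEFAULTED']
print(decode_lacp_state(0x00))  # []
```

As far as I understand the protocol, DEFAULTED on the actor together
with an all-zero partner state means the port is running on default
partner information because no LACPDU has arrived from the other side,
which fits Scott's reading that the switch was not sending any.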


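
Scott's description of "optimistic" versus strict mode, quoted above,
can be reduced to a small truth function. This is only a sketch of the
behaviour as he describes it, not the kernel's logic: channel_operational
and its parameters are hypothetical names, and it ignores the timing
issue he mentions (strict mode having to be disabled on setup rather
than afterwards).

```python
def channel_operational(link_up, lacpdu_received, strict_mode):
    """Simplified model of when a LACP port is treated as operational.

    Optimistic mode (as described for FreeBSD 9.x): only the link state
    matters. Strict mode: LACPDUs from the switch are required as well.
    """
    if not link_up:
        return False
    if strict_mode:
        return lacpdu_received
    return True


# The situation in this thread: link is up, but the switch sends nothing.
print(channel_operational(link_up=True, lacpdu_received=False, strict_mode=True))   # False
print(channel_operational(link_up=True, lacpdu_received=False, strict_mode=False))  # True
```

Under this model a passive switch that never sends LACPDUs leaves a
strict-mode port down, while the optimistic behaviour keeps it up for
as long as the link itself is up.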