svn commit: r341586 - head/sys/dev/mlx5/mlx5_en

John Baldwin jhb at FreeBSD.org
Wed May 1 16:45:58 UTC 2019


On 5/1/19 1:05 AM, Slava Shwartsman wrote:
> 
> 
> On 01-May-19 10:28, Slava Shwartsman wrote:
>>
>>
>> On 01-May-19 10:09, Slava Shwartsman wrote:
>>>
>>>
>>> On 30-Apr-19 00:14, John Baldwin wrote:
>>>> On 4/25/19 12:10 AM, Slava Shwartsman wrote:
>>>>>
>>>>>
>>>>> On 17-Apr-19 00:28, John Baldwin wrote:
>>>>>> On 4/16/19 8:32 AM, Hans Petter Selasky wrote:
>>>>>>> On 4/16/19 4:39 PM, Andrey V. Elsukov wrote:
>>>>>>>> On 05.12.2018 17:25, Slava Shwartsman wrote:
>>>>>>>>> Author: slavash
>>>>>>>>> Date: Wed Dec  5 14:25:03 2018
>>>>>>>>> New Revision: 341586
>>>>>>>>> URL: https://svnweb.freebsd.org/changeset/base/341586
>>>>>>>>>
>>>>>>>>> Log:
>>>>>>>>>      mlx5en: Implement backpressure indication.
>>>>>>>>>      The backpressure indication is implemented using an 
>>>>>>>>> unlimited rate type of
>>>>>>>>>      mbuf send tag. When the upper layers, typically the socket 
>>>>>>>>> layer, have obtained such
>>>>>>>>>      a tag, it can then query the destination driver queue for 
>>>>>>>>> the current
>>>>>>>>>      amount of space available in the send queue.
>>>>>>>>>      A single mbuf send tag may be referenced multiple times and 
>>>>>>>>> a refcount has been added
>>>>>>>>>      to the mlx5e_priv structure to track its usage. Because the 
>>>>>>>>> send tag resides
>>>>>>>>>      in the mlx5e_channel structure, there is no need to wait 
>>>>>>>>> for refcounts to reach
>>>>>>>>>      zero until the mlx5en(4) driver is detached. The channels 
>>>>>>>>> structure is persistent
>>>>>>>>>      during the lifetime of the mlx5en(4) driver it belongs to 
>>>>>>>>> and can therefore be accessed
>>>>>>>>>      without any need of synchronization.
>>>>>>>>>      The mlx5e_snd_tag structure was extended to contain a type 
>>>>>>>>> field, because there are now
>>>>>>>>>      two different tag types which end up in the driver which 
>>>>>>>>> need to be distinguished.
>>>>>>>>>      Submitted by:   hselasky@
>>>>>>>>>      Approved by:    hselasky (mentor)
>>>>>>>>>      MFC after:      1 week
>>>>>>>>>      Sponsored by:   Mellanox Technologies
>>>>>>>>> @@ -587,27 +609,33 @@ mlx5e_xmit(struct ifnet *ifp, struct mbuf 
>>>>>>>>> *mb)
>>>>>>>>>         struct mlx5e_sq *sq;
>>>>>>>>>         int ret;
>>>>>>>>> -    sq = mlx5e_select_queue(ifp, mb);
>>>>>>>>> -    if (unlikely(sq == NULL)) {
>>>>>>>>> -#ifdef RATELIMIT
>>>>>>>>> -        /* Check for route change */
>>>>>>>>> -        if (mb->m_pkthdr.snd_tag != NULL &&
>>>>>>>>> -            mb->m_pkthdr.snd_tag->ifp != ifp) {
>>>>>>>>> +    if (mb->m_pkthdr.snd_tag != NULL) {
>>>>>>>>> +        sq = mlx5e_select_queue_by_send_tag(ifp, mb);
>>>>>>>>> +        if (unlikely(sq == NULL)) {
>>>>>>>>> +            /* Check for route change */
>>>>>>>>> +            if (mb->m_pkthdr.snd_tag->ifp != ifp) {
>>>>>>>>> +                /* Free mbuf */
>>>>>>>>> +                m_freem(mb);
>>>>>>>>> +
>>>>>>>>> +                /*
>>>>>>>>> +                 * Tell upper layers about route
>>>>>>>>> +                 * change and to re-transmit this
>>>>>>>>> +                 * packet:
>>>>>>>>> +                 */
>>>>>>>>> +                return (EAGAIN);
>>>>>>>>> +            }
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I just discovered something strange and found that this commit is 
>>>>>>>> the
>>>>>>>> cause.
>>>>>>>> The test system has an mlx5en 100G interface. It has two vlans: 
>>>>>>>> vlan500 and
>>>>>>>> vlan100.
>>>>>>>> Via vlan500 it receives some packet flows. Then it routes these 
>>>>>>>> packets
>>>>>>>> into vlan100.
>>>>>>>> But packets are dropped in mlx5e_xmit() with EAGAIN error code.
>>>>>>>>
>>>>>>>> # dtrace -n 'fbt::ip6_output:return {printf("%d", arg1);}'
>>>>>>>> dtrace: description 'fbt::ip6_output:return ' matched 1 probe
>>>>>>>> CPU     ID                    FUNCTION:NAME
>>>>>>>>     23  54338                ip6_output:return 35
>>>>>>>>     16  54338                ip6_output:return 35
>>>>>>>>     21  54338                ip6_output:return 35
>>>>>>>>     22  54338                ip6_output:return 35
>>>>>>>>     24  54338                ip6_output:return 35
>>>>>>>>     23  54338                ip6_output:return 35
>>>>>>>>     14  54338                ip6_output:return 35
>>>>>>>> ^C
>>>>>>>>
>>>>>>>> # dtrace -n 'fbt::mlx5e_xmit:return {printf("%d", arg1);}'
>>>>>>>> dtrace: description 'fbt::mlx5e_xmit:return ' matched 1 probe
>>>>>>>> CPU     ID                    FUNCTION:NAME
>>>>>>>>     16  69030                mlx5e_xmit:return 35
>>>>>>>>     23  69030                mlx5e_xmit:return 35
>>>>>>>>     26  69030                mlx5e_xmit:return 35
>>>>>>>>     25  69030                mlx5e_xmit:return 35
>>>>>>>>     24  69030                mlx5e_xmit:return 35
>>>>>>>>     21  69030                mlx5e_xmit:return 35
>>>>>>>>     26  69030                mlx5e_xmit:return 35
>>>>>>>> ^C
>>>>>>>>
>>>>>>>> The kernel config is GENERIC.
>>>>>>>> 13.0-CURRENT #9 r345758+82f3d57(svn_head)-dirty
>>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> This might be a case where rcvif in the mbuf's pktheader is not 
>>>>>>> cleared
>>>>>>> before the packet is fed back on the wire.
>>>>>>>
>>>>>>> John Baldwin is working on the send tags implementation, to eliminate
>>>>>>> the EAGAIN handling in the network drivers.
>>>>>>
>>>>>> I will try to push this branch sooner then since it affects more 
>>>>>> than just
>>>>>> TLS.  Part of the change includes a new flag we can use to assert 
>>>>>> that we
>>>>> Thanks John!
>>>>>> aren't just getting a stale rcvif (though there are also now 
>>>>>> assertions in
>>>>>> ip_output that should catch this case I think).
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Hi Andrey,
>>>>>
>>>>> Yes, we were able to reproduce this issue in house. If you don't 
>>>>> mind, I
>>>>> prefer to wait for John's update - where he eliminates the EAGAIN
>>>>> handling in the network drivers.
>>>>
>>>> I have rebased the branch for this, but for now it will just panic 
>>>> sooner
>>>> I believe by tripping an assertion.  Can you grab the diff (or just 
>>>> the branch)
>>>> from the 'send_tags' branch at github/bsdjhb/freebsd and reproduce 
>>>> under a
>>>> kernel with INVARIANTS?  I think we will have to explicitly clear the 
>>>> 'rcvif'
>>>> pointer somewhere, but I want to see what the stack trace looks like 
>>>> so I can
>>>> think about the "right" place to clear it.
>>>>
>>>
>>> Hi John,
>>>
>>> I grabbed your branch (which doesn't build BTW due to libbe(3): Fix 
>>> mis-application of patch (SHLIBDIR) so I just reverted it).
>>>
>>> The kernel doesn't panic in this scenario - it's just that the packets 
>>> are being dropped. So I added a kdb_backtrace() call right before the 
>>> return (EAGAIN) in mlx5e_xmit:
>>>
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
>>> 0xfffffe0000547d90
>>> mlx5e_xmit() at mlx5e_xmit+0x3d/frame 0xfffffe0000548160
>>> vlan_transmit() at vlan_transmit+0xdc/frame 0xfffffe00005481d0
>>> ether_output_frame() at ether_output_frame+0xa2/frame 0xfffffe0000548200
>>> ether_output() at ether_output+0x689/frame 0xfffffe00005482a0
>>> ip_output() at ip_output+0x13a4/frame 0xfffffe00005483f0
>>> ip_forward() at ip_forward+0x344/frame 0xfffffe00005484b0
>>> ip_input() at ip_input+0x7f5/frame 0xfffffe0000548560
>>> netisr_dispatch_src() at netisr_dispatch_src+0xa2/frame 
>>> 0xfffffe00005485d0
>>> ether_demux() at ether_demux+0x147/frame 0xfffffe0000548600
>>> ether_nh_input() at ether_nh_input+0x403/frame 0xfffffe0000548660
>>> netisr_dispatch_src() at netisr_dispatch_src+0xa2/frame 
>>> 0xfffffe00005486d0
>>> ether_input() at ether_input+0x73/frame 0xfffffe0000548700
>>> vlan_input() at vlan_input+0x1e7/frame 0xfffffe0000548750
>>> ether_demux() at ether_demux+0x12d/frame 0xfffffe0000548780
>>> ether_nh_input() at ether_nh_input+0x403/frame 0xfffffe00005487e0
>>> netisr_dispatch_src() at netisr_dispatch_src+0xa2/frame 
>>> 0xfffffe0000548850
>>> ether_input() at ether_input+0x73/frame 0xfffffe0000548880
>>> mlx5e_rx_cq_comp() at mlx5e_rx_cq_comp+0x8b4/frame 0xfffffe00005489a0
>>> mlx5_cq_completion() at mlx5_cq_completion+0x5e/frame 0xfffffe00005489d0
>>> mlx5_msix_handler() at mlx5_msix_handler+0x1ba/frame 0xfffffe0000548a10
>>> ithread_loop() at ithread_loop+0x187/frame 0xfffffe0000548a70
>>> fork_exit() at fork_exit+0x84/frame 0xfffffe0000548ab0
>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000548ab0
>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>>>
>>>
>>> Please ping me if you want me to try anything else.
>>>
>>>
>>> Slava
>>
>> My bad - tested with master. Re-testing now.
>>
>>
>> Slava
> 
> Got it now:
> 
> panic: Assertion m->m_pkthdr.snd_tag == NULL failed at 
> /usr/src/sys/netinet/ip_output.c:213
> cpuid = 0
> time = 1556697834
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
> 0xfffffe00005ac1f0
> vpanic() at vpanic+0x19d/frame 0xfffffe00005ac240
> panic() at panic+0x43/frame 0xfffffe00005ac2a0
> ip_output() at ip_output+0x159f/frame 0xfffffe00005ac3f0
> ip_forward() at ip_forward+0x38c/frame 0xfffffe00005ac4b0
> ip_input() at ip_input+0x7f5/frame 0xfffffe00005ac560
> netisr_dispatch_src() at netisr_dispatch_src+0xa2/frame 0xfffffe00005ac5d0
> ether_demux() at ether_demux+0x147/frame 0xfffffe00005ac600
> ether_nh_input() at ether_nh_input+0x403/frame 0xfffffe00005ac660
> netisr_dispatch_src() at netisr_dispatch_src+0xa2/frame 0xfffffe00005ac6d0
> ether_input() at ether_input+0x7d/frame 0xfffffe00005ac700
> vlan_input() at vlan_input+0x1e7/frame 0xfffffe00005ac750
> ether_demux() at ether_demux+0x12d/frame 0xfffffe00005ac780
> ether_nh_input() at ether_nh_input+0x403/frame 0xfffffe00005ac7e0
> netisr_dispatch_src() at netisr_dispatch_src+0xa2/frame 0xfffffe00005ac850
> ether_input() at ether_input+0x7d/frame 0xfffffe00005ac880
> mlx5e_rx_cq_comp() at mlx5e_rx_cq_comp+0x8b4/frame 0xfffffe00005ac9a0
> mlx5_cq_completion() at mlx5_cq_completion+0x5e/frame 0xfffffe00005ac9d0
> mlx5_msix_handler() at mlx5_msix_handler+0x1ba/frame 0xfffffe00005aca10
> ithread_loop() at ithread_loop+0x187/frame 0xfffffe00005aca70
> fork_exit() at fork_exit+0x84/frame 0xfffffe00005acab0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00005acab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> KDB: enter: panic
> [ thread pid 12 tid 100113 ]
> Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
> 
> 
> I can keep the machine in this state for a while now if you want to take 
> a look at anything specific.

I pushed further changes to this branch (that are now in the review that
Hans noted) that I think should fix the panic.  You can either pull the
branch (I'll try to rebase it today) or grab the diff from the review.
Please let me
know if this resolves the panic.  Thanks!
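
For anyone following the thread, here is a minimal userland sketch of the
check being discussed: a packet whose send tag belongs to a different
interface than the one it is transmitted on gets EAGAIN, and clearing the
tag before forwarding avoids that.  The struct layouts and the function
names xmit_check() and forward_packet() are hypothetical simplifications
for illustration only, not the driver's or the stack's actual code, and
the sketch does not model how a stale rcvif could come to be read as a
send tag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct ifnet {
	const char *name;
};

struct m_snd_tag {
	struct ifnet *ifp;		/* interface that owns the tag */
};

struct pkthdr {
	struct m_snd_tag *snd_tag;	/* send tag, if any */
};

struct mbuf {
	struct pkthdr m_pkthdr;
};

/*
 * Model of the route-change check: a tag owned by another
 * interface makes the transmit path return EAGAIN.
 */
static int
xmit_check(const struct ifnet *ifp, const struct mbuf *mb)
{
	assert(ifp != NULL && mb != NULL);

	if (mb->m_pkthdr.snd_tag != NULL &&
	    mb->m_pkthdr.snd_tag->ifp != ifp)
		return (EAGAIN);
	return (0);
}

/*
 * Model of forwarding: optionally clear the tag before handing
 * the packet to the outgoing interface.
 */
static int
forward_packet(const struct ifnet *out_ifp, struct mbuf *mb, int clear_tag)
{
	if (clear_tag)
		mb->m_pkthdr.snd_tag = NULL;
	return (xmit_check(out_ifp, mb));
}
```

With a tag owned by vlan500 and a packet sent out vlan100, the sketch
returns EAGAIN unless the tag is cleared first, which matches the symptom
Andrey reported.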

-- 
John Baldwin


More information about the svn-src-head mailing list