Re: RES: TP-LINK USB no carrier after speed test

From: Hans Petter Selasky <hps_at_selasky.org>
Date: Tue, 27 Sep 2022 18:17:54 UTC
On 9/27/22 15:22, Hans Petter Selasky wrote:
> On 9/27/22 14:17, Ivan Quitschal wrote:
>>
>>
>> On Tue, 27 Sep 2022, Hans Petter Selasky wrote:
>>
>>> On 9/27/22 02:24, Alexander Motin wrote:
>>>> On 26.09.2022 17:29, Hans Petter Selasky wrote:
>>>>> I've got a supposedly "broken" if_ure dongle from Alexander, but 
>>>>> I'm unable to reproduce the if_ure hang on two different pieces of 
>>>>> XHCI hardware, Intel based and AMD based, which I've got.
>>>>>
>>>>> This leads me to believe there is a bug in the XHCI driver or 
>>>>> hardware on your system.
>>>>>
>>>>> Can you share the pciconfig -lv output for your XHCI controllers?
>>>>
>>>> I have two laptops of different generations reproducing this 
>>>> problem, but both are having Thunderbolt on the USB-C ports:
>>>>
>>>> This is one (7th Gen Core i7):
>>>>
>>>> xhci1@pci0:56:0:0:      class=0x0c0330 rev=0x02 hdr=0x00 
>>>> vendor=0x8086 device=0x15d4 subvendor=0x2222 subdevice=0x1111
>>>>      vendor     = 'Intel Corporation'
>>>>      device     = 'JHL6540 Thunderbolt 3 USB Controller (C step) 
>>>> [Alpine Ridge 4C 2016]'
>>>>      class      = serial bus
>>>>      subclass   = USB
>>>>      bar   [10] = type Memory, range 32, base 0xc3f00000, size 
>>>> 65536, enabled
>>>>      cap 01[80] = powerspec 3  supports D0 D1 D2 D3  current D0
>>>>      cap 05[88] = MSI supports 8 messages, 64 bit enabled with 1 
>>>> message
>>>>      cap 10[c0] = PCI-Express 2 endpoint max data 128(128) RO NS
>>>>                   max read 512
>>>>                   link x4(x4) speed 2.5(2.5) ASPM disabled(L0s/L1) 
>>>> ClockPM disabled
>>>>      ecap 0003[100] = Serial 1 20ff910876f10c00
>>>>      ecap 0001[200] = AER 1 0 fatal 0 non-fatal 1 corrected
>>>>      ecap 0002[300] = VC 1 max VC0
>>>>      ecap 0004[400] = Power Budgeting 1
>>>>      ecap 000b[500] = Vendor [1] ID 1234 Rev 1 Length 216
>>>>      ecap 0018[600] = LTR 1
>>>>      ecap 0019[700] = PCIe Sec 1 lane errors 0
>>>>
>>>> This is another (11th Gen Core i7);
>>>>
>>>> xhci0@pci0:0:13:0:      class=0x0c0330 rev=0x01 hdr=0x00 
>>>> vendor=0x8086 device=0x9a13 subvendor=0x1028 subdevice=0x0991
>>>>      vendor     = 'Intel Corporation'
>>>>      device     = 'Tiger Lake-LP Thunderbolt 4 USB Controller'
>>>>      class      = serial bus
>>>>      subclass   = USB
>>>>      bar   [10] = type Memory, range 64, base 0x60552c0000, size 
>>>> 65536, enabled
>>>>      cap 01[70] = powerspec 2  supports D0 D3  current D0
>>>>      cap 05[80] = MSI supports 8 messages, 64 bit enabled with 1 
>>>> message
>>>>      cap 09[90] = vendor (length 20) Intel cap 15 version 0
>>>>      cap 09[b0] = vendor (length 0) Intel cap 0 version 1
>>>>
>>>> Does the system you also has Thunderbolt chip, or you use native 
>>>> Intel chipet's XHCI?
>>>>
>>>>> Also, when running the stress test and you see the traffic stops, 
>>>>> what happens if you run this command as root on the ugen which the 
>>>>> if_ure belongs to:
>>>>>
>>>>> usbconfig -d ugenX.Y dump_string 0
>>>>>
>>>>> Does the traffic resume?
>>>>
>>>> Nope. Out of 4 times when traffic stopped 2 times it reported <read 
>>>> error> and 2 times it completed successfully, but it neither case it 
>>>> recovered traffic.  Only reset recovered it.
>>>>
>>>
>>> Hi Alexander,
>>>
>>> Could you run "usbdump -d X.Y" at the same time to capture all the 
>>> errors?
>>>
>>> Looking especially for USB_ERR_TIMEOUT .
>>>
>>> I have this:
>>>
>>> xhci0@pci0:3:0:3:    class=0x0c0330 rev=0x00 hdr=0x00 vendor=0x1022 
>>> device=0x15e0 subvendor=0x1849 subdevice=0xffff
>>>    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
>>>    device     = 'Raven USB 3.1'
>>>    class      = serial bus
>>>    subclass   = USB
>>>
>>> xhci0@pci0:0:20:0:    class=0x0c0330 rev=0x21 hdr=0x00 vendor=0x8086 
>>> device=0x9d2f subvendor=0x8086 subdevice=0x9d2f
>>>    vendor     = 'Intel Corporation'
>>>    device     = 'Sunrise Point-LP USB 3.0 xHCI Controller'
>>>    class      = serial bus
>>>    subclass   = USB
>>>
>>> --HPS
>>>
>>
>>
>> hi Hans
>>
>> i think i got some good logs for you
>>
>> before the problem i ran this:
>>
>> ugen0.10: <TP-Link USB 10/100/1000 LAN> at usbus0, cfg=0 md=HOST 
>> spd=SUPER (5.0Gbps) pwr=ON (72mA)
>>
>> # usbconfig -d ugen0.10 >> before
>> # usbconfig -d ugen0.10 dump_all_desc >> before
>> # usbconfig -d ugen0.10 dump_stats >> before_status
>>
>> the after the problem happened i ran
>>
>> # usbconfig -d ugen0.10 >> after
>> # usbconfig -d ugen0.10 dump_all_desc >> after
>> # usbconfig -d ugen0.10 dump_stats >> after_status
>>
>>
>> just by looking i already see some problems comparing both
>>
>> for example
>>
>> before the problem we have:
>>
>> ----------------------
>> ugen0.10: <TP-Link USB 10/100/1000 LAN> at usbus0, cfg=0 md=HOST 
>> spd=SUPER (5.0Gbps) pwr=ON (72mA)
>> ugen0.10: <TP-Link USB 10/100/1000 LAN> at usbus0, cfg=0 md=HOST 
>> spd=SUPER (5.0Gbps) pwr=ON (72mA)
>>
>>    bLength = 0x0012
>>    bDescriptorType = 0x0001
>>    bcdUSB = 0x0300
>>    bDeviceClass = 0x0000  <Probed by interface class>
>>    bDeviceSubClass = 0x0000
>>    bDeviceProtocol = 0x0000
>>    bMaxPacketSize0 = 0x0009
>>    idVendor = 0x2357
>>    idProduct = 0x0601
>>    bcdDevice = 0x3000
>> ****
>>    iManufacturer = 0x0001  <TP-Link>
>>    iProduct = 0x0002  <USB 10/100/1000 LAN>
>>    iSerialNumber = 0x0006  <000001>
>>    bNumConfigurations = 0x0002
>>
>> ------------------------
>>
>> after the problem
>>
>> --------------------------
>> ugen0.10: <TP-Link USB 10/100/1000 LAN> at usbus0, cfg=0 md=HOST 
>> spd=SUPER (5.0Gbps) pwr=ON (72mA)
>> ugen0.10: <TP-Link USB 10/100/1000 LAN> at usbus0, cfg=0 md=HOST 
>> spd=SUPER (5.0Gbps) pwr=ON (72mA)
>>
>>    bLength = 0x0012
>>    bDescriptorType = 0x0001
>>    bcdUSB = 0x0300
>>    bDeviceClass = 0x0000  <Probed by interface class>
>>    bDeviceSubClass = 0x0000
>>    bDeviceProtocol = 0x0000
>>    bMaxPacketSize0 = 0x0009
>>    idVendor = 0x2357
>>    idProduct = 0x0601
>>    bcdDevice = 0x3000
>> ****
>> iManufacturer = 0x0001  <retrieving string failed>
>>    iProduct = 0x0002  <retrieving string failed>
>>    iSerialNumber = 0x0006  <retrieving string failed>
>>    bNumConfigurations = 0x0002
>>
>>   Configuration index 0 --------------------------
>>
>>
>> the log in ttyv0 was this:
>>
>> ure0: at uhub0, port 14, addr 9 (disconnected)
>> Sep 27 08:55:58 tzk-inspiron dhclient[1201]: receive_packet failed on 
>> ue0: Device not configured
>> Sep 27 08:55:58 tzk-inspiron dhclient[1201]: ioctl(SIOCGIFFLAGS) on 
>> ue0: Operation not permitted
>> Sep 27 08:55:58 tzk-inspiron dhclient[1201]: Interface ue0 no longer 
>> appears valid.
>> Sep 27 08:55:58 tzk-inspiron dhclient[1201]: No live interfaces to 
>> poll on - exiting.
>> Sep 27 08:55:58 tzk-inspiron dhclient[1201]: exiting.
>> Sep 27 08:55:58 tzk-inspiron dhclient[1201]: connection closed
>> Sep 27 08:55:58 tzk-inspiron dhclient[1201]: exiting.
>> rgephy0: detached
>> miibus0: detached
>> ure0: detached
>>
>>
>> difference between before_status and after_status
>>
>> before_status:
>>
>> ugen0.10: <TP-Link USB 10/100/1000 LAN> at usbus0, cfg=0 md=HOST 
>> spd=SUPER (5.0Gbps) pwr=ON (72mA)
>>
>> {
>>      UE_CONTROL_OK       : 2389
>>      UE_ISOCHRONOUS_OK   : 0
>>      UE_BULK_OK          : 803
>>      UE_INTERRUPT_OK     : 0
>>      UE_CONTROL_FAIL     : 0
>>      UE_ISOCHRONOUS_FAIL : 0
>>      UE_BULK_FAIL        : 0
>>      UE_INTERRUPT_FAIL   : 0
>> }
>>
>>
>> after_status:
>>
>> ugen0.10: <TP-Link USB 10/100/1000 LAN> at usbus0, cfg=0 md=HOST 
>> spd=SUPER (5.0Gbps) pwr=ON (72mA)
>>
>> {
>>      UE_CONTROL_OK       : 4275
>>      UE_ISOCHRONOUS_OK   : 0
>>      UE_BULK_OK          : 1126702
>>      UE_INTERRUPT_OK     : 0
>>      UE_CONTROL_FAIL     : 326
>>      UE_ISOCHRONOUS_FAIL : 0
>>      UE_BULK_FAIL        : 42
>>      UE_INTERRUPT_FAIL   : 0
>> }
>>
>>
>>
>> i hope that helps
>>
>>
>> all log files are attached
>>
>> thanks
>>
>> --tzk
> 
> 
> One more test request:
> 
> Make sure you have a kernel with "options USB_DEBUG".
> 
> Then after "iperf" says 0bits/s, then you run:
> 
> sysctl hw.usb.xhci.debug=29
> 
> Then run those usbconfig commands.
> 
> Then then:
> 
> sysctl hw.usb.xhci.debug=0
> 
> And collect the logs from /var/log/messages .
> 
> Like said earlier, I wonder if the fault is in the XHCI controller 
> itself or something that has to do with thunderbolt, because on my 
> machine there is no indication of a if_ure device failure. The only way 
> to know for sure is to buy a USB 3.0 wire analyzer, like the beagle one, 
> but I'm unsure if they support USB-C .
> 
> Thank you!
> 
> --HPS
> 

FYI: There is some experimental thunderbolt support at:

https://github.com/hselasky/usb4

But I'm not sure if it supports the hardware you've got.

--HPS