RPI 4B on UEFI: xhci0 disconnects under high load

Fri Sep 25 10:39:18 UTC 2020

Could the failure of the ACPI patch be related to the pre-September 2020 dtbs reporting 4 GB as available for PCI DMA (as you reported today in another thread)?
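
For reference, the arithmetic behind the two figures can be sketched as below. This is only an illustration: it assumes, per Mark's description quoted further down, that xhci DMA is reliable only for RAM below 3072 MiB. The helper name dma_safe is hypothetical and is not a FreeBSD API.

```python
MIB = 1024 * 1024

# Limit taken from the thread: RAM capped at 3072 MiB for reliable xhci use.
XHCI_DMA_LIMIT = 3072 * MIB      # 0xC0000000
# 4 GB, the range the older dtbs are said to report (assumed to mean 4096 MiB).
REPORTED_DMA_RANGE = 4096 * MIB


def dma_safe(phys_addr, length, limit=XHCI_DMA_LIMIT):
    """Return True if [phys_addr, phys_addr + length) lies entirely below limit.

    Hypothetical helper for illustration only.
    """
    return phys_addr >= 0 and length > 0 and phys_addr + length <= limit


print(hex(XHCI_DMA_LIMIT))                            # 0xc0000000
print((REPORTED_DMA_RANGE - XHCI_DMA_LIMIT) // MIB)   # 1024
```

So if the reported range really is the full 4 GB, the top 1024 MiB would be advertised as usable for DMA even though, by the limit above, it is not.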

On Fri, Sep 25, 2020 at 09:49, Mark Millard via freebsd-arm <freebsd-arm at freebsd.org> wrote:

> On 2020-Sep-25, at 00:58, Robert Clausecker <fuz at fuz.su> wrote:
>
>> Hi Mark,
>>
>> Thanks for your quick response! It appears that I had missed that this
>> changeset was required. Let me update to the most recent revision and try
>> again.
>
> My context is based on head -r365932 . In github terms, at:
>
> https://github.com/freebsd/freebsd/commit/173c619
>
> After that I do not know if anything new interferes.
>
>> Yes, I have been using the github mirror. Are there any other
>> patches I should consider applying?
>
> Most of my patches are for powerpc64 and powerpc (old PowerMacs).
> The only aarch64-related patch I have is for:
>
> /usr/src/sys/dev/acpica/acpi.c
>
> from https://reviews.freebsd.org/D25219 . But you have
> reported having this one in place. As I remember it is
> required to have rpi4-uefi-devel work at all. But it
> still requires that the uefi be configured to limit the
> RAM to 3072 MiBytes if you want reliable behavior for
> xhci use: FreeBSD does not correctly respect the DMA
> limitations for xhci use for ACPI based booting.
>
> (I have a type of test that fails without the 3072 MiByte
> limitation imposed.)
>
> You have reported having other patches in place that I
> do not have. I do not know about the status of those.
>
>> Yours,
>> Robert Clausecker
>>
>> On Thu, Sep 24, 2020 at 06:19:36PM -0700, Mark Millard wrote:
>>>
>>>
>>> On 2020-Sep-24, at 15:47, Robert Clausecker <fuz at fuz.su> wrote:
>>>
>>>> Good evening!
>>>>
>>>> I have set up a FreeBSD system on a Raspberry Pi 4B as described
>>>> in bug #249520 (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249520).
>>>> After setting up the USB drive on a USB 2.0 port, the system boots.
>>>> However, when the system is under high I/O load (I tested this by
>>>> compiling a Go toolchain), the USB controller eventually hangs and
>>>> causes the system to effectively crash:
>>>>
>>>> ---
>>>> xhci_interrupt: host system error
>>>> xhci0: Resetting controller
>>>
>>> The log history from before the above messages would be
>>> useful here. (The later messages are likely
>>> consequences of the above.)
>>>
>>> Also: ed6978a9a70 in github is from:
>>>
>>> QUOTE
>>> Author: ian
>>> Date: Mon Sep 14 17:33:28 2020
>>> New Revision: 365729
>>> URL:
>>> https://svnweb.freebsd.org/changeset/base/365729
>>>
>>> Log:
>>> Add product ID strings for a couple Microchip usb hubs. Also, update the
>>> vendor ID string to say just "Microchip Technology" -- the buyout of
>>> Standard Microsystems happened in 2012 and the SMC/SMSC names are pretty
>>> much retired at this point.
>>> END QUOTE
>>>
>>>
>>> but there is a more recent check-in required to
>>> avoid at least one way of getting "Resetting controller"
>>> for -mcpu=cortex-a72 :
>>>
>>> QUOTE
>>> Author: hselasky
>>> Date: Sat Sep 19 22:37:45 2020
>>> New Revision: 365918
>>> URL:
>>> https://svnweb.freebsd.org/changeset/base/365918
>>>
>>> Log:
>>> Fix for use of the XHCI driver on Cortex-A72 by adding a missing cache
>>> flush operation before writing to the XHCI_ERSTBA_LO/HI register(s).
>>> END QUOTE
>>>
>>> [I do suggest that you report which git repository that you
>>> are referencing since there are multiple ones right now that
>>> have differing hashes. I guessed github from "(master)",
>>> figuring that the cgit-beta.freebsd.org one would have
>>> "(main)".]
>>>
>>>> uhub1: at usbus0, port 1, addr 1 (disconnected)
>>>> ugen0.2: <vendor 0x2109 USB2.0 Hub> at usbus0 (disconnected)
>>>> uhub2: at uhub1, port 1, addr 1 (disconnected)
>>>> ugen0.3: <ASIX Elec. Corp. AX88x72A> at usbus0 (disconnected)
>>>> axe0: at uhub2, port 2, addr 2 (disconnected)
>>>> ukphy0: detached
>>>> miibus0: detached
>>>> axe0: detached
>>>> ugen0.4: <VLI Manufacture String VLI Product String> at usbus0 (disconnected)
>>>> umass0: at uhub2, port 4, addr 3 (disconnected)
>>>> (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 01 85 d9 0d 00 00 80 00
>>>> (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
>>>> (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
>>>> da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
>>>> da0: <WDC WDS2 40G2G0B-00EP UJ43> s/n ABCDEFA74566 detached
>>>> Solaris: WARNING: Pool 'tau' has encountered an uncorrectable I/O failure and has been suspended.
>>>>
>>>> Solaris: WARNING: Pool 'tau' has encountered an uncorrectable I/O failure and has been suspended.
>>>> ---
>>>
>>> The above messages I think are just consequences of earlier
>>> problems.
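
Since the later messages appear to be consequences of the first error, one way to pull out the earlier history is sketched below. This is a minimal sketch, assuming the console output has been saved as text. The function name and the default of 20 context lines are hypothetical choices; the marker string is the one from the log in this thread.

```python
def context_before_first_error(log_text,
                               marker="xhci_interrupt: host system error",
                               before=20):
    """Return up to `before` lines preceding the first line that contains
    `marker`, followed by that line itself. Returns [] if no line matches."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if marker in line:
            return lines[max(0, i - before): i + 1]
    return []


# Usage with a shortened sample; the first line is a placeholder, not real output.
sample = "\n".join([
    "(earlier kernel messages)",
    "xhci_interrupt: host system error",
    "xhci0: Resetting controller",
    "uhub1: at usbus0, port 1, addr 1 (disconnected)",
])
for line in context_before_first_error(sample, before=5):
    print(line)
```

Only the first match is reported, on the assumption that everything after it is fallout.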
>>>
>>>> This is despite having applied D25219 and the D26493--D26496 series
>>>> of patches which were supposed to address this sort of issue. The same
>>>> issue does not seem to appear with an older kernel to which the
>>>> D26493--D26496 series of patches was not applied and which was not
>>>> compiled with -mcpu=cortex-a72. The older kernel identifies itself as
>>>>
>>>> FreeBSD 13.0-CURRENT #2 ed6978a9a70-c271559(master)-dirty
>>>>
>>>> It's the one I described in my earlier mails to this list. So it seems
>>>> that in this case, pulling in patches meant to fix a bug has
>>>> introduced it in the first place. Any idea what could have happened?
>>>
>>> I strongly suggest using a FreeBSD vintage that includes
>>> the corrected XHCI driver.
>>
>
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)