RPI 4B on UEFI: xhci0 disconnects under high load

Fri Sep 25 01:19:48 UTC 2020

On 2020-Sep-24, at 15:47, Robert Clausecker <fuz at fuz.su> wrote:

> Good evening!
> 
> I have set up a FreeBSD system on a Raspberry Pi 4B as described
> in bug #249520 (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249520).
> After setting up the USB drive on a USB 2.0 port, the system boots.
> However, when the system is under high I/O load (I tested this by
> compiling a Go toolchain), the USB controller eventually hangs and
> causes the system to effectively crash:
> 
> ---
> xhci_interrupt: host system error
> xhci0: Resetting controller

The messages leading up to the above would be
worth including. (The later messages are likely
consequences of the above.)

Also: ed6978a9a70 in github is from:

QUOTE
Author: ian
Date: Mon Sep 14 17:33:28 2020
New Revision: 365729
URL: https://svnweb.freebsd.org/changeset/base/365729

Log:
  Add product ID strings for a couple Microchip usb hubs.  Also, update the
  vendor ID string to say just "Microchip Technology" -- the buyout of
  Standard Microsystems happened in 2012 and the SMC/SMSC names are pretty
  much retired at this point.
END QUOTE

but a more recent check-in is required to
avoid at least one way of getting "Resetting controller"
when building with -mcpu=cortex-a72:

QUOTE
Author: hselasky
Date: Sat Sep 19 22:37:45 2020
New Revision: 365918
URL: https://svnweb.freebsd.org/changeset/base/365918

Log:
  Fix for use of the XHCI driver on Cortex-A72 by adding a missing cache
  flush operation before writing to the XHCI_ERSTBA_LO/HI register(s).
END QUOTE

[I suggest that you report which git repository you
are referencing, since there are multiple ones right now
with differing hashes. I guessed github from "(master)",
figuring that the cgit-beta.freebsd.org one would have
"(main)".]
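
As a rough illustration of that guess, here is a minimal Python
sketch that splits a version suffix like the one reported and maps
the branch name to a likely repository. The function name, the
regular expression, and the field names are assumptions made up for
this illustration, not anything taken from the FreeBSD build scripts,
and the branch-to-repository mapping is only the guess described
above.

```python
import re

# Assumed layout, inferred only from the one string in this thread:
#   <short hash>-c<number>(<branch>)[-dirty]
IDENT_RE = re.compile(
    r"^(?P<hash>[0-9a-f]+)-c(?P<count>\d+)\((?P<branch>[^)]+)\)(?P<dirty>-dirty)?$"
)

# The guess from the text: "(master)" suggests github,
# "(main)" suggests cgit-beta.freebsd.org.
LIKELY_REPO = {"master": "github", "main": "cgit-beta.freebsd.org"}


def describe_ident(ident):
    """Split a version suffix into its parts (hypothetical helper)."""
    m = IDENT_RE.match(ident)
    if m is None:
        return None  # layout differs from the assumed one
    branch = m.group("branch")
    return {
        "hash": m.group("hash"),
        "count": int(m.group("count")),
        "branch": branch,
        "dirty": m.group("dirty") is not None,
        "likely_repo": LIKELY_REPO.get(branch, "unknown"),
    }


info = describe_ident("ed6978a9a70-c271559(master)-dirty")
print(info)
```

A branch name is only a hint, so stating the repository explicitly
in a report is still the safer route.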

> uhub1: at usbus0, port 1, addr 1 (disconnected)
> ugen0.2: <vendor 0x2109 USB2.0 Hub> at usbus0 (disconnected)
> uhub2: at uhub1, port 1, addr 1 (disconnected)
> ugen0.3: <ASIX Elec. Corp. AX88x72A> at usbus0 (disconnected)
> axe0: at uhub2, port 2, addr 2 (disconnected)
> ukphy0: detached
> miibus0: detached
> axe0: detached
> ugen0.4: <VLI Manufacture String VLI Product String> at usbus0 (disconnected)
> umass0: at uhub2, port 4, addr 3 (disconnected)
> (da0:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 01 85 d9 0d 00 00 80 00 
> (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
> da0: <WDC WDS2 40G2G0B-00EP UJ43>  s/n ABCDEFA74566 detached
> Solaris: WARNING: Pool 'tau' has encountered an uncorrectable I/O failure and has been suspended.
> 
> Solaris: WARNING: Pool 'tau' has encountered an uncorrectable I/O failure and has been suspended.
> ---

I think the above messages are just consequences of the
earlier problems.

> This is despite having applied D25219 and the D26493--D26496 series
> of patches which were supposed to address this sort of issue.  The same
> issue does not seem to appear with an older kernel to which the
> D26493--D26496 series of patches was not applied and which was not
> compiled with -mcpu=cortex-a72.  The older kernel identifies itself as
> 
>    FreeBSD 13.0-CURRENT #2 ed6978a9a70-c271559(master)-dirty
> 
> It's the one I described in my earlier mails to this list.  So it seems
> that in this case, pulling in patches meant to fix a bug seems to have
> introduced it in the first place.  Any idea what could have happened?

I strongly suggest using a FreeBSD version that includes
the corrected XHCI driver (r365918 or later).
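
As a quick sanity check, the two svn revisions quoted above can be
compared directly. This is only a sketch: the helper name is made up
for illustration, and it assumes the tree tracks the same svn branch
linearly, so that a higher revision number contains every earlier
check-in.

```python
# r365918 is the XHCI cache-flush fix quoted above.
XHCI_FIX_REVISION = 365918


def includes_xhci_fix(svn_revision):
    """Hypothetical helper: True if a tree at this revision has the fix.

    Assumes linear history on one svn branch.
    """
    return svn_revision >= XHCI_FIX_REVISION


# r365729 is the check-in that ed6978a9a70 corresponds to.
older_kernel_has_fix = includes_xhci_fix(365729)
print(older_kernel_has_fix)
```

Under that assumption, a tree at r365729 does not yet contain the
fix, so local patches applied on top of it would not supply it
either unless they include that change.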

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)