FYI: RPi4B "C0T" vs. "B0T" (Inbound from PCIe 3 GiByte limit for XHCI use) in various Operating System/Firmware combinations

From: Mark Millard <marklmi_at_yahoo.com>
Date: Tue, 24 Jan 2023 23:13:22 UTC
[To a notable extent here, "C0T" vs. "B0T" RPi4B handling is a
stand-in example for the more general issue of Device Tree
dma-ranges support for address space translations (or analogous
in UEFI/ACPI), not limited to RPi*'s at all.]

In exploring the status of handling "C0T" RPi4B's (and more) vs.
"B0T" RPi4B's for the XHCI related context that involved the
3 GiByte mistake in the PCIe wrapper logic for the "B0T" parts,
what I have found is that . . .

A) OpenBSD has had full Device Tree handling of the dma-ranges
   in place for some time. It supports "C0T" XHCI use in a manner
   that avoids needing the bouncing. For "B0T" it uses the
   dma-ranges and spans the 3 GiByte range as well, no longer
   using a hand coded, much smaller value somewhat under 1
   GiByte. OpenBSD normally uses U-Boot. Some (possibly
   outdated?) material indicates that RPi400's require(?)
   https://github.com/pftf/RPi4/ (EDK2) use instead, warning
   to probably use a specific known-working release, 1.21 as
   I remember. If that is the case, see the EDK2 related notes
   later. (I do not have access to a RPi400.)

B) Linux has had Device Tree handling of the dma-ranges in
   place for some time. It supports "C0T" XHCI use in a manner
   that avoids needing the bouncing. For "B0T" it uses the
   dma-ranges and spans the 3 GiByte range as well. Fedora
   uses U-Boot. RaspiOS64 (my abbreviation) does not (nor does
   it use EDK2). Both avoid bouncing for "C0T". (I've not
   looked at others in operation.)

C) The EDK2 implementation treats "C0T" parts in the same manner
   as it handles the "B0T" parts. So bounce buffers end up being
   used for memory above the lower 3 GiBytes. Thus, even if
   something using EDK2 could otherwise avoid that bouncing for
   "C0T" parts, it would still end up bouncing: only a
   smaller-size aspect is published, using an identity for the
   address space translation. But see below for OpenBSD mixed
   with EDK2/DeviceTree.

D) For EDK2/DeviceTree, OpenBSD actually undoes some of what EDK2
   provides. According to comments in the code, this was done so
   that OpenBSD could support more modern RPi* firmware, in order
   to support more modern devices. (Some of what EDK2 does was
   to avoid earlier problems with OpenBSD and Linux when used
   with EDK2. So much for a clean firmware/OS partitioning.)

E) NetBSD is based on UEFI/ACPI and, so, is an example of (C)
   for https://github.com/pftf/RPi4/ . I'm not sure if NetBSD
   would "just work" if EDK2 started to allow "C0T" parts to
   have the PCIe space vs. CPU memory space address
   translations that are involved in avoiding bouncing. As
   stands, it might be untested because EDK2 avoids involving
   such space translations: only a smaller-size aspect is
   actually in use, instead of an address space translation
   with the full size.

F) FreeBSD Device Tree handling hand codes a "B0T"-like structure
   and ignores dma-ranges completely. If I have understood the
   PCIe/XHCI related code correctly, there is no general
   infrastructure for supporting the generality of the PCIe
   space vs. CPU memory space addressing translations that
   dma-ranges support would need to be effective. More than
   just decoding the dma-ranges in order to initialize an
   existing infrastructure would be required as far as code
   changes go from what I can tell.

   So, if I understand right, for any Device Tree with dma-ranges
   indicating non-identity address space translations, FreeBSD
   currently needs a hand coded no-address-translation
   alternative, unless someone goes to the effort to add the
   infrastructure. It is not just an RPi* issue.

   Also, for an identity address space translation but a
   smaller-size, this too is ignored. But it appears that the
   existing infrastructure could be initialized for this case,
   if I understood correctly. It did not appear to have to be
   handled in RPi* specific code.
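
For illustration only, here is a minimal Python sketch of the two
cases discussed in (A) through (F): an identity translation with a
smaller size ("B0T"-like) and a non-identity translation with a full
size ("C0T"-like). It is not code from OpenBSD, Linux, EDK2, NetBSD,
or FreeBSD. The names DmaRange and plan_dma are hypothetical, and the
"C0T"-like bus base and size are made-up example values, not values
read from any real Device Tree.

```python
from dataclasses import dataclass

GIB = 1 << 30
PAGE = 4096


@dataclass(frozen=True)
class DmaRange:
    """One dma-ranges style window (hypothetical helper type)."""
    bus_base: int  # start address as seen from the PCIe (child) side
    cpu_base: int  # start address as seen from the CPU (parent) side
    size: int      # window length in bytes

    def covers(self, cpu_addr: int, length: int) -> bool:
        # The whole buffer must fit inside the window; the end is exclusive.
        end = self.cpu_base + self.size
        return self.cpu_base <= cpu_addr and cpu_addr + length <= end

    def to_bus(self, cpu_addr: int) -> int:
        # Non-identity translation: shift by the difference of the bases.
        return cpu_addr - self.cpu_base + self.bus_base


def plan_dma(ranges, cpu_addr, length):
    """Return ("direct", bus_addr) if some window covers the buffer,
    otherwise ("bounce", None) to signal that a bounce buffer is needed."""
    for r in ranges:
        if r.covers(cpu_addr, length):
            return ("direct", r.to_bus(cpu_addr))
    return ("bounce", None)


# "B0T"-like: identity translation, limited to the lower 3 GiBytes.
b0t_like = [DmaRange(bus_base=0, cpu_base=0, size=3 * GIB)]

# "C0T"-like: ASSUMED example values for a non-identity, full-size window.
c0t_like = [DmaRange(bus_base=0x4_0000_0000, cpu_base=0, size=8 * GIB)]
```

With these example windows, a page at CPU address 3 GiByte needs a
bounce buffer in the "B0T"-like case but maps directly in the
"C0T"-like case, which is the difference that the dma-ranges
infrastructure has to be able to express.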

Note: FreeBSD's UEFI/ACPI support for the RPi4B used to not handle
even just the smaller-size aspect when the translation was just an
identity translation: it was set up such that one page past the
end of the smaller-size ended up being allowed to be used. That
led to corrupted files and such in deliberate testing. I've not
checked the status of this in recent times.
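
As a hedged illustration of how a "one page past the end" error of
this kind can arise, the Python sketch below contrasts an inclusive
comparison against an exclusive end address with a correct check.
This is an assumption about one possible mechanism, not the actual
FreeBSD code. The names LIMIT, page_ok_buggy, and page_ok are
hypothetical.

```python
GIB = 1 << 30
PAGE = 4096

# Exclusive end of the usable window: the first address that is NOT usable.
LIMIT = 3 * GIB


def page_ok_buggy(page_addr: int, limit: int = LIMIT) -> bool:
    # Treats the exclusive end as if it were the last usable page address,
    # so the page that starts exactly at `limit` is wrongly accepted.
    return page_addr <= limit


def page_ok(page_addr: int, limit: int = LIMIT) -> bool:
    # The whole page must end at or before the exclusive end address.
    return page_addr + PAGE <= limit
```

The buggy form accepts exactly one extra page, the one starting at
LIMIT, while both forms agree on every page below it.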

===
Mark Millard
marklmi at yahoo.com