A basis for a possible update to the pcie based xhci support? It survived huge-file duplicate-then-diff testing so far.
Mark Millard
marklmi at yahoo.com
Wed Oct 7 04:50:46 UTC 2020
On 2020-Oct-6, at 21:43, Mark Millard <marklmi at yahoo.com> wrote:
> Note: based on a head -r365932 context, not more recent.
>
> First off, a note about lowaddr values. What sysctl showed
> me were the likes of (prior to the changes that this note
> is about):
>
> . . .
> hw.busdma.zone2.lowaddr: 0x3c000fff
> . . .
> hw.busdma.zone1.lowaddr: 0x3fffffff
> . . .
> hw.busdma.zone0.lowaddr: 0xffffffff
> . . .
>
> So I've guessed that lowaddr should identify the
> end page of the possibly-use-it region, not the
> first do-not-use-it page.
> If wrong, at most it
> should avoid bouncing one page that it could
> avoid.
That was a wonderfully messed up sentence.
Trying again:
"If wrong, at most it would bounce one page that it
could avoid bouncing."
> But, if correct, it would bounce a page that it
> should bounce, instead of not doing so.
>
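To make the two readings of lowaddr concrete, here is a minimal sketch in plain C. It is only a model of the comparison, not the kernel's busdma code, and the function names are made up for illustration. Which reading the kernel actually applies is not asserted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Last address of the 3 GiB window, as in the DMA_HIGH_LIMIT definition. */
#define LIMIT_INCLUSIVE ((uint64_t)0xc0000000u - 1)

static_assert(LIMIT_INCLUSIVE == 0xbfffffffu, "3 GiB - 1");

/* Reading 1 (hypothetical name): lowaddr is the last usable address. */
bool
bounces_if_last_usable(uint64_t paddr, uint64_t lowaddr)
{
    return (paddr > lowaddr);
}

/* Reading 2 (hypothetical name): lowaddr is the first excluded address. */
bool
bounces_if_first_excluded(uint64_t paddr, uint64_t lowaddr)
{
    return (paddr >= lowaddr);
}
```

With lowaddr = 0xbfffffff, reading 2 would bounce the address 0xbfffffff itself, so the page holding it is bounced without need: the "one page" cost. With lowaddr = 0xc0000000 under reading 1, the address 0xc0000000 would escape bouncing although it lies outside the window.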
> Otherwise what I've done is put back some of your old
> bcm2838_pci.c code and removed the sc->sc_bus.dma_bits
> adjustment from the bcm2838_xhci.c code. Be warned
> that I copied the likes of REG_VALUE_4GB_WINDOW and
> REG_VALUE_4GB_CONFIG without understanding the values
> or encoding. (I'm not pcie knowledgeable.)
>
> # svnlite diff /usr/src/sys/arm/broadcom/
> Index: /usr/src/sys/arm/broadcom/bcm2835/bcm2838_pci.c
> ===================================================================
> --- /usr/src/sys/arm/broadcom/bcm2835/bcm2838_pci.c (revision 365932)
> +++ /usr/src/sys/arm/broadcom/bcm2835/bcm2838_pci.c (working copy)
> @@ -91,27 +91,22 @@
> #define REG_EP_CONFIG_CHOICE 0x9000
> #define REG_EP_CONFIG_DATA 0x8000
>
> +#define REG_VALUE_4GB_WINDOW 0x11
> +#define REG_VALUE_4GB_CONFIG 0x88003000
> +
> /*
> * The system memory controller can address up to 16 GiB of physical memory
> * (although at time of writing the largest memory size available for purchase
> - * is 8 GiB). However, the system DMA controller is capable of accessing only a
> - * limited portion of the address space. Worse, the PCI-e controller has further
> - * constraints for DMA, and those limitations are not wholly clear to the
> - * author. NetBSD and Linux allow DMA on the lower 3 GiB of the physical memory,
> - * but experimentation shows DMA performed above 960 MiB results in data
> - * corruption with this driver. The limit of 960 MiB is taken from OpenBSD, but
> + * is 8 GiB). However, the system DMA controller in early enough boards is
> + * capable of accessing only a limited portion of the address space (3 GiByte).
> + * Worse, the PCI-e controller has further constraints for DMA, and those
> + * limitations are not wholly clear to the author. NetBSD and Linux allow
> + * DMA on the lower 3 GiB of the physical memory. OpenBSD used 960 MiByte but
> * apparently that value was chosen for satisfying a constraint of an unrelated
> * peripheral.
> - *
> - * Whatever the true maximum address, 960 MiB works.
> */
> -#define DMA_HIGH_LIMIT 0x3c000000
> -#define MAX_MEMORY_LOG2 0x21
> -#define REG_VALUE_DMA_WINDOW_LOW (MAX_MEMORY_LOG2 - 0xf)
> +#define DMA_HIGH_LIMIT ((bus_addr_t)0xc0000000u-1)
> #define REG_VALUE_DMA_WINDOW_HIGH 0x0
> -#define DMA_WINDOW_ENABLE 0x3000
> -#define REG_VALUE_DMA_WINDOW_CONFIG \
> - (((MAX_MEMORY_LOG2 - 0xf) << 0x1b) | DMA_WINDOW_ENABLE)
>
> #define REG_VALUE_MSI_CONFIG 0xffe06540
>
> @@ -645,9 +640,9 @@
> DMA_HIGH_LIMIT, /* lowaddr */
> BUS_SPACE_MAXADDR, /* highaddr */
> NULL, NULL, /* filter, filterarg */
> - DMA_HIGH_LIMIT, /* maxsize */
> + BUS_SPACE_MAXSIZE, /* maxsize */
> BUS_SPACE_UNRESTRICTED, /* nsegments */
> - DMA_HIGH_LIMIT, /* maxsegsize */
> + BUS_SPACE_MAXSIZE, /* maxsegsize */
> 0, /* flags */
> NULL, NULL, /* lockfunc, lockarg */
> &sc->dmat);
> @@ -674,9 +669,9 @@
> * Set PCI->CPU memory window. This encodes the inbound window showing
> * the system memory to the controller.
> */
> - bcm_pcib_set_reg(sc, REG_DMA_WINDOW_LOW, REG_VALUE_DMA_WINDOW_LOW);
> + bcm_pcib_set_reg(sc, REG_DMA_WINDOW_LOW, REG_VALUE_4GB_WINDOW);
> bcm_pcib_set_reg(sc, REG_DMA_WINDOW_HIGH, REG_VALUE_DMA_WINDOW_HIGH);
> - bcm_pcib_set_reg(sc, REG_DMA_CONFIG, REG_VALUE_DMA_WINDOW_CONFIG);
> + bcm_pcib_set_reg(sc, REG_DMA_CONFIG, REG_VALUE_4GB_CONFIG);
>
> bcm_pcib_set_reg(sc, REG_BRIDGE_GISB_WINDOW, 0);
> bcm_pcib_set_reg(sc, REG_DMA_WINDOW_1, 0);
> Index: /usr/src/sys/arm/broadcom/bcm2835/bcm2838_xhci.c
> ===================================================================
> --- /usr/src/sys/arm/broadcom/bcm2835/bcm2838_xhci.c (revision 365932)
> +++ /usr/src/sys/arm/broadcom/bcm2835/bcm2838_xhci.c (working copy)
> @@ -189,15 +189,7 @@
> bcm_xhci_install_xhci_firmware(dev);
>
> error = xhci_pci_attach(dev);
> - if (error)
> - return (error);
> -
> - /* 32 bit DMA is a limitation of the PCI-e controller, not the VL805. */
> - sc->sc_bus.dma_bits = 32;
> - if (bootverbose)
> - device_printf(dev, "note: switched to 32-bit DMA.\n");
> -
> - return (0);
> + return (error);
> }
>
> /*
>
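On the encoding of the two copied values: the macros removed in the diff above build the window registers from a log2 memory size. Applying the same arithmetic with a log2 of 0x20 (4 GiB) instead of 0x21 reproduces 0x11 and 0x88003000. The sketch below only checks that arithmetic. The helper names and the two named constants for 0xf and 0x1b are hypothetical, and what the hardware does with each field is not confirmed here.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_LOG2_BIAS   0xfu    /* hypothetical name for the "- 0xf" */
#define WINDOW_SHIFT       0x1bu   /* hypothetical name for the "<< 0x1b" */
#define DMA_WINDOW_ENABLE  0x3000u /* value from the removed macro */

/* Value for REG_DMA_WINDOW_LOW given a window of 2^log2_size bytes. */
uint32_t
dma_window_low(unsigned log2_size)
{
    /* The biased size must fit in the 5 bits above bit 27. */
    assert(log2_size >= WINDOW_LOG2_BIAS);
    assert(log2_size - WINDOW_LOG2_BIAS < 32);
    return ((uint32_t)(log2_size - WINDOW_LOG2_BIAS));
}

/* Value for REG_DMA_CONFIG, following the removed macro's formula. */
uint32_t
dma_window_config(unsigned log2_size)
{
    return ((dma_window_low(log2_size) << WINDOW_SHIFT) |
        DMA_WINDOW_ENABLE);
}
```

For log2_size 0x21 this gives 0x12 and 0x90003000 (the removed values); for 0x20 it gives 0x11 and 0x88003000 (the REG_VALUE_4GB_* values).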
> I've concluded from what I've seen in the code that lowaddr
> should be based on the pcie properties and should not worry
> about the maxsize and maxsegsize figures being possibly
> smaller: that is a dma engine use worry, not a pci one. (Not
> that I could get that from the documentation that I quoted in
> the review.) Thus I put back the 2 BUS_SPACE_MAXSIZE uses.
>
> After the first huge-file duplicate-then-diff test sysctl
> reported lots of bounced transfers:
>
> # sysctl hw.busdma
> hw.busdma.zone1.alignment: 4096
> hw.busdma.zone1.lowaddr: 0x3fffffff
> hw.busdma.zone1.total_deferred: 0
> hw.busdma.zone1.total_bounced: 755770
> hw.busdma.zone1.active_bpages: 0
> hw.busdma.zone1.reserved_bpages: 0
> hw.busdma.zone1.free_bpages: 838
> hw.busdma.zone1.total_bpages: 838
> hw.busdma.zone0.alignment: 4096
> hw.busdma.zone0.lowaddr: 0xffffffff
> hw.busdma.zone0.total_deferred: 0
> hw.busdma.zone0.total_bounced: 0
> hw.busdma.zone0.active_bpages: 256
> hw.busdma.zone0.reserved_bpages: 0
> hw.busdma.zone0.free_bpages: 257
> hw.busdma.zone0.total_bpages: 513
> hw.busdma.total_bpages: 1351
>
> For the non-power-of-2 boundary (0xc0000000-1), it
> appears to use the next smaller power of 2 for the
> boundary (0x40000000-1), without having to explicitly
> code both types of values specially for the RPi4B.
> (Of course, it also avoids using 2 GiBytes to
> potentially avoid more bouncing.)
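A small check of the sizes in that paragraph, as a sketch with hypothetical helper names. It only does arithmetic on the boundary values. Why busdma ends up with a zone at 0x3fffffff is not established by it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when x is a power of two. */
bool
is_pow2(uint64_t x)
{
    return (x != 0 && (x & (x - 1)) == 0);
}

/* Largest power of two that does not exceed x (x must be nonzero). */
uint64_t
floor_pow2(uint64_t x)
{
    uint64_t p = 1;

    assert(x != 0);
    while (p <= x / 2)
        p *= 2;
    return (p);
}
```

0xc0000000 (3 GiB) is not a power of two, and the largest power of two not above it is 0x80000000 (2 GiB). The zone boundary that sysctl reports, 0x40000000 (1 GiB), is one step smaller still, which leaves 0xc0000000 - 0x40000000 = 2 GiB of the window subject to bouncing.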
>
> I'll note that, prior to the change, an example
> first test showed:
>
> hw.busdma.zone2.total_bounced: 1091942
>
> and 174 in zone 1. So the bounce count has
> decreased.
>
> I'll note that "total_bounced" need not be
> a page count: it is incremented by 1 after
> the loop for a bounce, not inside the loop.
> Lots of pages of data were bounced.
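The counting pattern described above can be modeled as follows. This is a simplified sketch, not the kernel source: the struct, the function and the pages_copied field are hypothetical, and only the placement of the increment relative to the loop is the point.

```c
#include <assert.h>
#include <stddef.h>

struct bounce_stats {
    unsigned long total_bounced; /* operations that bounced anything */
    unsigned long pages_copied;  /* hypothetical per-page count */
};

/* Model of one sync operation that has npages bounce pages to copy. */
void
sync_bounce(struct bounce_stats *st, size_t npages)
{
    assert(st != NULL);
    for (size_t i = 0; i < npages; i++)
        st->pages_copied++;  /* stands in for copying one page */
    if (npages > 0)
        st->total_bounced++; /* once per operation, after the loop */
}
```

Two operations of 16 and 4 pages would leave total_bounced at 2 while 20 pages were copied, so the sysctl figure is a lower bound on the number of pages bounced.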
>
> For reference (the test as of a gpu_mem_1024=32
> context):
>
> Physical memory chunk(s):
> 0x00000000002000 - 0x00000007ef0fff, 133099520 bytes (32495 pages)
> 0x00000007f0f000 - 0x00000034bfffff, 751767552 bytes (183537 pages)
> 0x00000036052000 - 0x0000003cb2efff, 112054272 bytes (27357 pages)
> 0x0000003cb36000 - 0x0000003cb36fff, 4096 bytes (1 pages)
> 0x0000003cb38000 - 0x0000003cb39fff, 8192 bytes (2 pages)
> 0x0000003cb3b000 - 0x0000003cb3cfff, 8192 bytes (2 pages)
> 0x0000003cb40000 - 0x0000003cb40fff, 4096 bytes (1 pages)
> 0x0000003cb42000 - 0x0000003cb43fff, 8192 bytes (2 pages)
> 0x0000003cb45000 - 0x0000003df4ffff, 21016576 bytes (5131 pages)
> 0x0000003df60000 - 0x0000003dffffff, 655360 bytes (160 pages)
> 0x00000040000000 - 0x000000fbffffff, 3154116608 bytes (770048 pages)
> 0x00000100000000 - 0x000001f372afff, 4084379648 bytes (997163 pages)
>
>
> FYI, before the huge-file duplicate-and-diff test:
>
> # sysctl hw.busdma
> hw.busdma.zone1.alignment: 4096
> hw.busdma.zone1.lowaddr: 0x3fffffff
> hw.busdma.zone1.total_deferred: 0
> hw.busdma.zone1.total_bounced: 866
> hw.busdma.zone1.active_bpages: 2
> hw.busdma.zone1.reserved_bpages: 0
> hw.busdma.zone1.free_bpages: 836
> hw.busdma.zone1.total_bpages: 838
> hw.busdma.zone0.alignment: 4096
> hw.busdma.zone0.lowaddr: 0xffffffff
> hw.busdma.zone0.total_deferred: 0
> hw.busdma.zone0.total_bounced: 0
> hw.busdma.zone0.active_bpages: 256
> hw.busdma.zone0.reserved_bpages: 0
> hw.busdma.zone0.free_bpages: 257
> hw.busdma.zone0.total_bpages: 513
> hw.busdma.total_bpages: 1351
>
> After the duplicate but before the diff:
>
> # sysctl hw.busdma
> hw.busdma.zone1.alignment: 4096
> hw.busdma.zone1.lowaddr: 0x3fffffff
> hw.busdma.zone1.total_deferred: 0
> hw.busdma.zone1.total_bounced: 513604
> hw.busdma.zone1.active_bpages: 8
> hw.busdma.zone1.reserved_bpages: 0
> hw.busdma.zone1.free_bpages: 830
> hw.busdma.zone1.total_bpages: 838
> hw.busdma.zone0.alignment: 4096
> hw.busdma.zone0.lowaddr: 0xffffffff
> hw.busdma.zone0.total_deferred: 0
> hw.busdma.zone0.total_bounced: 0
> hw.busdma.zone0.active_bpages: 256
> hw.busdma.zone0.reserved_bpages: 0
> hw.busdma.zone0.free_bpages: 257
> hw.busdma.zone0.total_bpages: 513
> hw.busdma.total_bpages: 1351
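The zone1 total_bounced snapshots can be turned into per-phase figures. This is a sketch that assumes the three snapshots (866 before the test, 513604 after the duplicate, 755770 after the diff) come from one and the same run, which the text does not state outright. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* zone1 total_bounced values taken from the sysctl output above. */
#define BOUNCED_BEFORE_TEST  866u
#define BOUNCED_AFTER_DUP    513604u
#define BOUNCED_AFTER_DIFF   755770u

/* Increase of a counter that only grows, between two snapshots. */
uint64_t
counter_delta(uint64_t before, uint64_t after)
{
    assert(after >= before);
    return (after - before);
}
```

On that assumption the duplicate accounts for 512738 bounce operations and the diff for 242166, for 754904 over the whole test.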
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)