Re: Pinebook Pro IOMMU enabled crashes

From: Jesper Schmitz Mouridsen <jsm_at_FreeBSD.org>
Date: Thu, 30 Sep 2021 16:28:09 UTC
On 23.09.2021 19.58, Jesper Schmitz Mouridsen wrote:
> Hi
> 
> I just rebuilt a GENERIC arm64 kernel with only this change:
> 
> diff --git a/sys/arm64/conf/GENERIC b/sys/arm64/conf/GENERIC
> index c716183aae61..7a609db412ca 100644
> --- a/sys/arm64/conf/GENERIC
> +++ b/sys/arm64/conf/GENERIC
> @@ -19,7 +19,7 @@
> 
>   cpu            ARM64
>   ident          GENERIC
> -
> +options                IOMMU
>   include                "std.arm64"
>   include                "std.dev"
> 
> FreeBSD 14.0-CURRENT #6 main-n249584-fd69939e79a6-dirty
> 
> The panic does not happen without the NVMe drive attached.
> 
> pcib0: <Rockchip PCIe controller> mem 
> 0xf8000000-0xf9ffffff,0xfd000000-0xfdffffff irq 6,7,8 on ofwbus0
> pci0: <OFW PCI bus> on pcib0
> pcib1: <PCI-PCI bridge> at device 0.0 on pci0
> pcib0: failed to reserve resource for pcib1
> pcib1: failed to allocate initial memory window: 0-0xfffff
> pci1: <PCI bus> on pcib1
> nvme0: <Generic NVMe Device> at device 0.0 on pci1
> Fatal data abort:
>    x0:                0
>    x1:             1000
>    x2:            10040
>    x3:             2000
>    x4:                1
>    x5: ffff00009a7e0168
>    x6: 1400000000000000
>    x7:   10000000000000
>    x8:             1168
>    x9:                1
>   x10:                0
>   x11: ffff000000e8c8c0
>   x12: ffff000000e8c840
>   x13:                1
>   x14:            10000
>   x15:                1
>   x16:            10000
>   x17: ffff000000e8c85c
>   x18: ffff000001064180
>   x19: ffff000001064248
>   x20:                0
>   x21: ffff00009a7df000
>   x22: ffffa0000102ea00
>   x23: ffffa00000bb6b80
>   x24: ffffa00001086200
>   x25: ffff000000aa8478
>   x26: ffffa00001086300
>   x27: ffff000000dda000
>   x28:                7
>   x29: ffff000001064190
>    sp: ffff000001064180
>    lr: ffff00000075f20c
>   elr: ffff00000078a654
> spsr:         200000c5
>   far:                0
>   esr:         96000004
> panic: vm_fault failed: ffff00000078a654 error 1
> cpuid = 0
> time = 1
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> vpanic() at vpanic+0x184
> panic() at panic+0x44
> data_abort() at data_abort+0x23c
> handle_el1h_sync() at handle_el1h_sync+0x78
> --- exception, esr 0x96000004
> iommu_map_msi() at iommu_map_msi+0x20
> gicv3_iommu_init() at gicv3_iommu_init+0x4c
> intr_alloc_msix() at intr_alloc_msix+0x13c
> rk_pcie_alloc_msix() at rk_pcie_alloc_msix+0xfc
> pci_alloc_msix_method() at pci_alloc_msix_method+0x1a8
> nvme_pci_attach() at nvme_pci_attach+0x378
> device_attach() at device_attach+0x400
> device_probe_and_attach() at device_probe_and_attach+0x7c
> bus_generic_attach() at bus_generic_attach+0x18
> pci_attach() at pci_attach+0xe8
> device_attach() at device_attach+0x400
> device_probe_and_attach() at device_probe_and_attach+0x7c
> bus_generic_attach() at bus_generic_attach+0x18
> device_attach() at device_attach+0x400
> device_probe_and_attach() at device_probe_and_attach+0x7c
> bus_generic_attach() at bus_generic_attach+0x18
> pci_attach() at pci_attach+0xe8
> device_attach() at device_attach+0x400
> device_probe_and_attach() at device_probe_and_attach+0x7c
> bus_generic_attach() at bus_generic_attach+0x18
> rk_pcie_attach() at rk_pcie_attach+0x14cc
> device_attach() at device_attach+0x400
> device_probe_and_attach() at device_probe_and_attach+0x7c
> bus_generic_new_pass() at bus_generic_new_pass+0xf8
> bus_generic_new_pass() at bus_generic_new_pass+0xa8
> bus_generic_new_pass() at bus_generic_new_pass+0xa8
> bus_set_pass() at bus_set_pass+0x4c
> mi_startup() at mi_startup+0x12c
> virtdone() at virtdone+0x6c
> 
> /jsm
> 
> 
> On 23.09.2021 09.19, Emmanuel Vadot wrote:
>> On Sat, 18 Sep 2021 13:15:45 +0200
>> Jesper Schmitz Mouridsen <jsm@FreeBSD.org> wrote:
>>
>>> Hi
>>>
>>> Perhaps this one is giving trouble:
>>> https://www.mail-archive.com/svn-src-head@freebsd.org/msg126068.html
>>>
>>> main-n249225-f673cc5edac3-dirty
>>> nvme0: <Generic NVMe Device> at device 0.0 on pci1
>>> Fatal data abort:
>>>     x0:                0
>>>     x1:             1000
>>>     x2:            10040
>>>     x3:             2000
>>>     x4:                1
>>>     x5: ffff00009a7a0168
>>>     x6: 1d00000000000000
>>>     x7:   10000000000000
>>>     x8:             1168
>>>     x9:                1
>>>    x10:                0
>>>    x11: ffff000000f35140
>>>    x12: ffff000000f350c0
>>>    x13:                1
>>>    x14:            10000
>>>    x15:                1
>>>    x16:            10000
>>>    x17: ffff000000f350dc
>>>    x18: ffff00000110d180
>>>    x19: ffff00000110d248
>>>    x20:                0
>>>    x21: ffff00009a79f000
>>>    x22: ffffa000010b0a00
>>>    x23: ffffa000010a2880
>>>    x24: ffffa0000116da00
>>>    x25: ffff000000b4fd78
>>>    x26: ffffa0000116db00
>>>    x27: ffff000000e83000
>>>    x28:                7
>>>    x29: ffff00000110d190
>>>     sp: ffff00000110d180
>>>     lr: ffff00000077520c
>>>    elr: ffff0000007a03ac
>>> spsr:         200000c5
>>>    far:                0
>>>    esr:         96000004
>>> panic: vm_fault failed: ffff0000007a03ac error 1
>>> cpuid = 0
>>> time = 1
>>> KDB: stack backtrace:
>>> db_trace_self() at db_trace_self
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>>> vpanic() at vpanic+0x184
>>> panic() at panic+0x44
>>> data_abort() at data_abort+0x23c
>>> handle_el1h_sync() at handle_el1h_sync+0x78
>>> --- exception, esr 0x96000004
>>> iommu_map_msi() at iommu_map_msi+0x20
>>> gicv3_iommu_init() at gicv3_iommu_init+0x4c
>>> intr_alloc_msix() at intr_alloc_msix+0x13c
>>> rk_pcie_alloc_msix() at rk_pcie_alloc_msix+0xfc
>>> pci_alloc_msix_method() at pci_alloc_msix_method+0x1a8
>>> nvme_pci_attach() at nvme_pci_attach+0x378
>>> device_attach() at device_attach+0x400
>>> device_probe_and_attach() at device_probe_and_attach+0x7c
>>> bus_generic_attach() at bus_generic_attach+0x18
>>> pci_attach() at pci_attach+0xe8
>>> device_attach() at device_attach+0x400
>>> device_probe_and_attach() at device_probe_and_attach+0x7c
>>> bus_generic_attach() at bus_generic_attach+0x18
>>> device_attach() at device_attach+0x400
>>> device_probe_and_attach() at device_probe_and_attach+0x7c
>>> bus_generic_attach() at bus_generic_attach+0x18
>>> pci_attach() at pci_attach+0xe8
>>> device_attach() at device_attach+0x400
>>> device_probe_and_attach() at device_probe_and_attach+0x7c
>>> bus_generic_attach() at bus_generic_attach+0x18
>>> rk_pcie_attach() at rk_pcie_attach+0x14cc
>>> device_attach() at device_attach+0x400
>>> device_probe_and_attach() at device_probe_and_attach+0x7c
>>> bus_generic_new_pass() at bus_generic_new_pass+0xf8
>>> bus_generic_new_pass() at bus_generic_new_pass+0xa8
>>> bus_generic_new_pass() at bus_generic_new_pass+0xa8
>>> bus_set_pass() at bus_set_pass+0x4c
>>> mi_startup() at mi_startup+0x12c
>>> virtdone() at virtdone+0x6c
>>>
>>   That's an old commit. Did you see this panic only recently, or has
>> it been happening for a while?
>>
> 


Even on stable/13-n247374-9faebc1e664d-dirty I get the same backtrace
when IOMMU is enabled and the NVMe is attached; a guess at the cause
follows below the trace.

pcib1: <PCI-PCI bridge> at device 0.0 on pci0
pcib0: failed to reserve resource for pcib1
pcib1: failed to allocate initial memory window: 0-0xfffff
pci1: <PCI bus> on pcib1
nvme0: <Generic NVMe Device> at device 0.0 on pci1
Fatal data abort:
   x0:                0
   x1:             1000
   x2:            10040
   x3:             2000
   x4:                1
   x5: ffff00009a99e160
   x6: 1400000000000000
   x7:   10000000000000
   x8:             1160
   x9: ffff000000cd7cc0
  x10:                0
  x11: ffff000000d89540
  x12: ffff000000d894c0
  x13:                1
  x14:            10000
  x15:                1
  x16:            10000
  x17:                0
  x18: ffff000000f5c250
  x19: ffff000000f5c318
  x20:                0
  x21: ffff00009a99d000
  x22: ffffa00000f06200
  x23: ffffa00000f49700
  x24: ffffa00000f8f500
  x25: ffff0000009b85f8
  x26: ffffa00000f8f600
  x27: ffff000000cd7000
  x28:                7
  x29: ffff000000f5c260
   sp: ffff000000f5c250
   lr: ffff0000006bf3dc
  elr: ffff0000006e15d0
spsr:         600001c5
  far:                0
  esr:         96000004
panic: vm_fault failed: ffff0000006e15d0
cpuid = 0
time = 1
KDB: stack backtrace:
#0 0xffff00000047c304 at kdb_backtrace+0x60
#1 0xffff000000437fd8 at vpanic+0x184
#2 0xffff000000437e50 at panic+0x44
#3 0xffff0000006d692c at data_abort+0x204
#4 0xffff0000006bb874 at handle_el1h_sync+0x74
#5 0xffff0000006bf3d8 at gicv3_iommu_init+0x4c
#6 0xffff0000006bf3d8 at gicv3_iommu_init+0x4c
#7 0xffff0000006b1940 at intr_alloc_msix+0x110
#8 0xffff0000007860c0 at rk_pcie_alloc_msix+0xfc
#9 0xffff000000219bbc at pci_alloc_msix_method+0x1a8
#10 0xffff00000020ba64 at nvme_pci_attach+0x378
#11 0xffff00000046bd80 at device_attach+0x400
#12 0xffff00000046d14c at bus_generic_attach+0x4c
#13 0xffff000000221f30 at pci_attach+0xe0
#14 0xffff00000046bd80 at device_attach+0x400
#15 0xffff00000046d14c at bus_generic_attach+0x4c
#16 0xffff00000046bd80 at device_attach+0x400
#17 0xffff00000046d14c at bus_generic_attach+0x4c
Uptime: 1s
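
In all three backtraces the fault address (far) is 0 and x0 is 0, and
the faulting frame is iommu_map_msi() called from gicv3_iommu_init()
while nvme0 allocates its MSI-X vectors.  That looks like a plain
NULL-pointer dereference: the RK3399 in the Pinebook Pro has no SMMU in
front of the PCIe controller, so my guess is that the IOMMU context
looked up for nvme0 comes back NULL and is then dereferenced when the
GIC tries to map the MSI page.  A rough, untested sketch of what I mean
is below; the function signature matches the real gicv3_iommu_init()
only by guesswork, and iommu_lookup_ctx()/iommu_msi_map_page() are
hypothetical helper names for illustration, not the actual sys/arm64
API:

static int
gicv3_iommu_init(device_t dev, device_t child,
    struct iommu_domain **domain)
{
	struct iommu_ctx *ctx;

	/* Hypothetical lookup; on RK3399 no SMMU covers the PCIe bus. */
	ctx = iommu_lookup_ctx(child);
	if (ctx == NULL) {
		/* No IOMMU for this device: skip the MSI mapping. */
		*domain = NULL;
		return (0);
	}

	/* Hypothetical wrapper that ends up in iommu_map_msi(). */
	return (iommu_msi_map_page(ctx, domain));
}

If that guess is right, a NULL check like the one above (or simply not
entering the IOMMU MSI path when no IOMMU unit claims the device)
should let the NVMe attach proceed without the panic.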