problem with pass-through on amd

Andriy Gapon avg at FreeBSD.org
Thu Nov 16 20:45:41 UTC 2017


On 14/11/2017 06:22, Anish wrote:
> Also ivhd has fault interrupt enabled which is very helpful in debugging:
> 
> [root@ryzen /home/anish/FreeBSD/head]# vmstat -ia |grep ivh
> irq256: ivhd0:fault                    0          0
> irq257: ivhd1:fault                    0          0
> 

Anish,

I have made several interesting discoveries regarding my problem.
One of them is that actually there were some IOMMU log events:

dev.ivhd.0.event_tail: 240
dev.ivhd.0.event_head: 0
dev.ivhd.0.event_intr_count: 0

But there were no interrupts, so the events remained unconsumed and unreported.
I examined the MSI configuration of the IOMMU PCI device and found that the
address and data registers were zeroed out.
I looked at dmesg and at the code and I realized why that happened.

So, first of all, I pre-load vmm via loader.conf.  Probably as a result of that,
the ivhd device attaches before any bridges and buses on my system.  And
amdvi_alloc_intr_resources() does a rather atypical thing: it configures an MSI
for a PCI device by writing directly to its configuration registers.  The PCI
bus code is completely unaware of those changes and wipes them out in
pci_add_child() -> pci_cfg_restore().

Also, I think that even if ivhd attached after the root PCI bus, what it does
would still be unsafe.  I think that, for example, a suspend-resume cycle would
wipe out the MSI configuration too.
In that case it would be better to use the PCI methods to configure MSI.
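
To illustrate why the direct register write is fragile, here is a small
userland toy model.  It is not kernel code and none of these names exist in
FreeBSD: struct toy_pci_dev, toy_direct_msi_write(), toy_bus_msi_setup() and
toy_cfg_restore() are all made up for this sketch.  The only assumption is the
one described above, namely that the bus code keeps its own record of the
device configuration and rewrites the hardware from that record on restore.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the MSI address/data register pair. */
struct msi_regs {
	uint64_t addr;
	uint16_t data;
};

/* Hypothetical device: real config space plus the bus's own record of it. */
struct toy_pci_dev {
	struct msi_regs hw;	/* what is actually programmed */
	struct msi_regs saved;	/* what the bus code believes is programmed */
};

/* Driver writes the registers directly; the bus record is not updated. */
void
toy_direct_msi_write(struct toy_pci_dev *d, uint64_t addr, uint16_t data)
{
	assert(d != NULL);
	d->hw.addr = addr;
	d->hw.data = data;
}

/* Driver goes through the bus, which records the values and programs them. */
void
toy_bus_msi_setup(struct toy_pci_dev *d, uint64_t addr, uint16_t data)
{
	assert(d != NULL);
	d->saved.addr = addr;
	d->saved.data = data;
	d->hw = d->saved;
}

/* Models a config restore: hardware is rewritten from the bus record. */
void
toy_cfg_restore(struct toy_pci_dev *d)
{
	assert(d != NULL);
	d->hw = d->saved;
}
```

In this model, a toy_direct_msi_write() followed by toy_cfg_restore() leaves
hw zeroed again, which matches the zeroed address and data registers I saw.
The same values set via toy_bus_msi_setup() survive the restore, because the
bus record and the hardware agree.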

Now, why does ivhd attach before the root Host-PCI bridge and what can we do to
fix the order?

ivrs_drv.c has this code:
/*
 * Load this module at the end after PCI re-probing to configure interrupt.
 */
DRIVER_MODULE_ORDERED(ivhd, acpi, ivhd_driver, ivhd_devclass, 0, 0,
                      SI_ORDER_ANY);

But apparently this SI_ORDER_ANY does not help much.
It affects only the driver registration order, but not the device probe and
attachment order.

This code is far more significant:
                ivhd_devs[i] = BUS_ADD_CHILD(parent, 1, "ivhd", i);

ivhd passes 1 as the order.
This is a very high order for the acpi bus.
As a comment in acpi_probe_child() says:
            /*
             * Create a placeholder device for this node.  Sort the
             * placeholder so that the probe/attach passes will run
             * breadth-first.  Orders less than ACPI_DEV_BASE_ORDER
             * are reserved for special objects (i.e., system
             * resources).
             */
where ACPI_DEV_BASE_ORDER is 100.

For example, the order of the Host-PCI bridge on my system is 120.
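
The effect of the order value can be sketched with a small userland model.
This is only an illustration, assuming that children of a bus are probed and
attached in ascending order value; struct toy_child, toy_sort_children() and
toy_attach_position() are hypothetical names, not the newbus implementation.
The values 1, 100 and 120 are the ones quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ACPI_DEV_BASE_ORDER	100	/* value quoted above */

/* Hypothetical child record: a name and the order passed when adding it. */
struct toy_child {
	const char *name;
	unsigned int order;
};

/*
 * Insertion sort by ascending order.  It is stable, so children with
 * equal order keep the sequence in which they were added.
 */
void
toy_sort_children(struct toy_child *c, size_t n)
{
	assert(c != NULL || n == 0);
	for (size_t i = 1; i < n; i++) {
		struct toy_child key = c[i];
		size_t j = i;

		while (j > 0 && c[j - 1].order > key.order) {
			c[j] = c[j - 1];
			j--;
		}
		c[j] = key;
	}
}

/* Position of a named child in the list, or n if it is not present. */
size_t
toy_attach_position(const struct toy_child *c, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(c[i].name, name) == 0)
			return (i);
	return (n);
}
```

With { "pcib0", 120 } and { "ivhd0", 1 } in the list, sorting puts ivhd0 ahead
of pcib0, which is the attach order I see with vmm preloaded.  As a what-if
only, giving ivhd0 an order above 120 (for example
TOY_ACPI_DEV_BASE_ORDER + 50) would place it after the bridge in this model.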

I must note that this is important only if vmm is preloaded (which is probably
not an extremely rare case).  If vmm is loaded after the system is booted then,
of course, ivhd will be probed after the PCI buses / bridges.

-- 
Andriy Gapon

