[Bug 216493] [Hyper-V] Mellanox ConnectX-3 VF driver doesn't work when FreeBSD runs on Hyper-V 2016

bugzilla-noreply at freebsd.org
Thu Jan 26 13:08:07 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216493

            Bug ID: 216493
           Summary: [Hyper-V] Mellanox ConnectX-3 VF driver doesn't work
                    when FreeBSD runs on Hyper-V 2016
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs at FreeBSD.org
          Reporter: decui at microsoft.com

Windows Server 2016 (Hyper-V 2016) can support PCIe pass-through and
NIC SR-IOV for non-Windows virtual machines (VMs) such as Linux and
FreeBSD VMs. A few months ago, we enabled PCIe pass-through for FreeBSD
VMs running on Hyper-V, successfully assigned a Mellanox ConnectX-3 PF
device to a VM, and the device worked fine in the VM.

Now we have added code to support NIC SR-IOV (which is based on PCIe
pass-through) in the Hyper-V hv_netvsc driver, but it turned out the VF driver
failed to load, so I ported two patches from Linux:
https://reviews.freebsd.org/D8867
https://reviews.freebsd.org/D8868

(Note: I only tested the PF/VF drivers in a FreeBSD VM running on
Hyper-V; I didn’t test the patches on a bare-metal FreeBSD machine,
since it’s not easy to set one up in our lab right now. It would be
really helpful if people could review the patches and help test on
bare metal.)

With the two patches, the VF driver worked in my limited testing.

BTW, this link (https://community.mellanox.com/docs/DOC-2242) shows how
to enable a Mellanox ConnectX-3 VF for a Windows VM running on Hyper-V
2012 R2. What I did for the FreeBSD VM on Hyper-V 2016 is quite
similar.


Next, I did more testing and identified 4 issues we need to address:
1. When the VF is hot-removed, I see the error below, but it looks
nonfatal: when the VF is later hot-added, it still works.

mlx4_core0: Failed to free mtt range at:20769 order:0
mlx4_core0: detached


2. The VF works fine when the VM has <=12 virtual CPUs, but if the VM has >=13
vCPUs, the VF driver fails to load:

  mlx4_core0: <mlx4_core> at device 2.0 on pci1
  mlx4_core: Initializing mlx4_core: Mellanox ConnectX VPI driver v2.1.6
  vmbus0: allocated type 3 (0xfe0800000-0xfe0ffffff) for rid 18 of mlx4_core0
  mlx4_core0: Lazy allocation of 0x800000 bytes rid 0x18 type 3 at 0xfe0800000
  mlx4_core0: Detected virtual function - running in slave mode
  mlx4_core0: Sending reset
  mlx4_core0: Sending vhcr0
  mlx4_core0: HCA minimum page size:512
  mlx4_core0: Timestamping is not supported in slave mode.
  mlx4_core0: attempting to allocate 20 MSI-X vectors (52 supported)
  mlx4_core0: using IRQs 256-275 for MSI-X
  mlx4_core0: Failed to allocate mtts for 1024 pages(order 10)
  mlx4_core0: Failed to initialize event queue table (err=-12), aborting.

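For what it's worth, the failure at exactly 13 vCPUs smells like a
power-of-two threshold: the driver creates roughly one event queue per
vCPU, and the MTT pages backing the EQ table come from a buddy
allocator, which rounds each request up to the next power-of-two order.
Below is a minimal user-space sketch of that rounding effect; the
constants (pages per EQ, the VF's MTT quota) are made-up assumptions,
not values from the driver, chosen only to show how one extra vCPU can
bump the request from order 9 to order 10:

#include <stdio.h>

/* Toy model, not driver code: every constant here is an assumption. */
#define PAGES_PER_EQ	40	/* assumed MTT pages consumed per EQ */
#define VF_MTT_QUOTA	512	/* assumed pages granted to the VF */

/* Smallest buddy order whose block covers 'pages' pages. */
static int
order_for_pages(int pages)
{
	int order = 0;

	while ((1 << order) < pages)
		order++;
	return (order);
}

int
main(void)
{
	for (int vcpus = 11; vcpus <= 14; vcpus++) {
		int pages = vcpus * PAGES_PER_EQ;
		int order = order_for_pages(pages);
		int rounded = 1 << order;

		printf("%2d vCPUs -> %3d pages -> order %2d (%4d pages): %s\n",
		    vcpus, pages, order, rounded,
		    rounded > VF_MTT_QUOTA ? "FAILS" : "ok");
	}
	return (0);
}

With these made-up numbers, 12 vCPUs round up to order 9 (512 pages,
within the quota) while 13 vCPUs round up to order 10 (1024 pages) and
fail, which at least matches the "Failed to allocate mtts for 1024
pages(order 10)" message above.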

3. The VF can't ping another VM's VF on the same host, and it can't
ping the PF on the same host either.

On the same host,
    Windows VM <-> Windows VM
and 
    Windows VM <-> Linux VM
are both OK.

Only FreeBSD VM <-> Windows/Linux VM traffic fails.

I suspect something is wrong or missing in the mlx4 VF driver in FreeBSD.


4. I got the messages below when Live Migration failed. It seems the
VF’s detach method couldn’t finish successfully.

Jan 11 19:16:43 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:16:43 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode FREE_RES (0xf01)
Jan 11 19:16:43 decui-b11 kernel: mlx4_core0: Failed to free mtt range at:5937
order:0
Jan 11 19:16:54 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:16:54 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode CLOSE_PORT (0xa)
Jan 11 19:18:04 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:18:04 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode FREE_RES (0xf01)
Jan 11 19:19:14 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:19:14 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:19:14 decui-b11 kernel: mlx4_core0: Fail to detach network rule.
registration id = 0x9000000000002
Jan 11 19:20:24 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:20:24 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:20:24 decui-b11 kernel: mlx4_core0: Fail to detach network rule.
registration id = 0x9000000000003
Jan 11 19:21:34 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:21:34 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:21:34 decui-b11 kernel: mlx4_core0: Fail to detach network rule.
registration id = 0x9000000000004
Jan 11 19:22:46 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:22:46 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:22:46 decui-b11 kernel: mlx4_core0: Fail to detach network rule.
registration id = 0x9000000000005
Jan 11 19:23:56 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:23:56 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:23:56 decui-b11 kernel: mlx4_core0: Fail to detach network rule.
registration id = 0x9000000000006
Jan 11 19:25:06 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:25:06 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode QP_FLOW_STEERING_DETACH (0x66)
Jan 11 19:25:06 decui-b11 kernel: mlx4_core0: Fail to detach network rule.
registration id = 0x9000000000007
Jan 11 19:26:16 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:26:16 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode SET_MCAST_FLTR (0x48)
Jan 11 19:27:26 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:27:26 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode FREE_RES (0xf01)
Jan 11 19:27:26 decui-b11 kernel: mlx4_core0: Failed to free icm of qp:2279
Jan 11 19:28:36 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:28:36 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode FREE_RES (0xf01)
Jan 11 19:28:36 decui-b11 kernel: mlx4_core0: Failed to release qp range
base:2279 cnt:1
Jan 11 19:29:46 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:29:46 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode 2RST_QP (0x21)
Jan 11 19:30:56 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:30:56 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode HW2SW_CQ (0x17)
Jan 11 19:30:56 decui-b11 kernel: mlx4_core0: HW2SW_CQ failed (-35) for CQN
0000b5
Jan 11 19:32:06 decui-b11 kernel: mlx4_core0: mlx4_comm_cmd_wait: Comm channel
is not idle. My toggle is 0 (op: 0x5)
Jan 11 19:32:06 decui-b11 kernel: mlx4_core0: failed execution of VHCR_POST
commandopcode FREE_RES (0xf01)
Jan 11 19:32:06 decui-b11 kernel: mlx4_core0: Failed freeing cq:181

More info about issue 4:

In the case of Live Migration, it looks like the host just rescinds the
VF by force, without sending the PCI_EJECT message to the VM. It looks
like the current Mellanox VF driver in FreeBSD can’t handle this case
(i.e., the VF device disappearing suddenly) and always hangs on command
timeouts, because at that point the host denies the VM access to the
VF.
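
To sketch the kind of fix that seems needed (a sketch only, with
invented names, not the actual mlx4 or vmbus code): the command path
could check a "device gone" flag, set from the bus's rescind
notification or after the first hard timeout, so that all subsequent
commands fail fast and detach can complete, instead of serializing one
long timeout per command as in the log above:

#include <errno.h>
#include <stdbool.h>

/*
 * Sketch with invented names, not the real driver: once the host has
 * rescinded the VF, every command would otherwise block for its full
 * timeout, which is why detach takes so long in the log above.
 */
struct vf_dev {
	volatile bool gone;	/* set when the host rescinds the VF */
};

/* Stubs standing in for the real hardware access. */
static int hw_post_command(struct vf_dev *d, int op) { (void)d; (void)op; return (0); }
static int hw_wait_done(struct vf_dev *d, int ms) { (void)d; (void)ms; return (-1); }

static int
vf_cmd(struct vf_dev *dev, int op, int timeout_ms)
{
	if (dev->gone)
		return (-ENODEV);	/* fail fast, no retries */

	if (hw_post_command(dev, op) != 0)
		return (-EIO);

	if (hw_wait_done(dev, timeout_ms) != 0) {
		/*
		 * The host is denying access; mark the device dead so
		 * every later command (FREE_RES, CLOSE_PORT, ...)
		 * returns immediately and detach can finish.
		 */
		dev->gone = true;
		return (-ETIMEDOUT);
	}
	return (0);
}

int
main(void)
{
	struct vf_dev dev = { .gone = false };

	(void)vf_cmd(&dev, 0xf01, 10000);	/* times out, marks dev gone */
	return (vf_cmd(&dev, 0x0a, 10000) == -ENODEV ? 0 : 1); /* fails fast */
}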

BTW, the VF driver in a Linux VM doesn’t hang, and it looks like Live
Migration works, but the driver still prints these scary messages:

Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Internal error
detected on the communication channel
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: device is going to
be reset
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: VF reset is not
needed
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: device was reset
successfully
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_en 99bb:00:02.0: Internal error
detected, restarting device
Jan 26 02:40:06 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: command 0x5
failed: fw status = 0x1
Jan 26 02:40:06 decui-lin-vm kernel: hv_netvsc vmbus_16 eth1: VF down:
enP39355p0s2
Jan 26 02:40:06 decui-lin-vm kernel: hv_netvsc vmbus_16 eth1: Data path
switched from VF: enP39355p0s2
Jan 26 02:40:06 decui-lin-vm kernel: hv_netvsc vmbus_16 eth1: VF unregistering:
enP39355p0s2

Jan 26 02:40:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Failed to close
slave function
Jan 26 02:40:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Detected virtual
function - running in slave mode
Jan 26 02:40:37 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: recovering from
previously mis-behaved VM
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Communication
channel is offline.
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: PF is not
responsive, skipping initialization
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: Failed to
initialize slave
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: mlx4_restart_one:
ERROR: mlx4_load_one failed, pci_name=99bb:00:02.0, err=-5
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: mlx4_restart_one
was ended, ret=-5
Jan 26 02:41:07 decui-lin-vm kernel: mlx4_core 99bb:00:02.0: mlx4_remove_one:
interface is down

I think at least we need to port this patch
“net/mlx4_core: Enable device recovery flow with SRIOV”
(https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=55ad359225b2232b9b8f04a0dfa169bd3a7d86d2)
from Linux to FreeBSD.
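
For reference, here is the core idea of that commit restated as a rough
sketch with invented names (the real patch does much more: resource
cleanup, PF/VF coordination, etc.): a periodic health poll detects the
internal error on the comm channel, flips an error flag so in-flight
commands bail out quickly, and schedules an asynchronous teardown and
reload of the VF, matching the "device is going to be reset" /
mlx4_restart_one sequence in the Linux log above:

#include <stdbool.h>
#include <stdio.h>

/* Conceptual sketch only; all names below are invented. */
struct vf_dev {
	volatile bool internal_error;
};

/* Stub: would poll the comm-channel state on real hardware. */
static bool
comm_channel_dead(struct vf_dev *dev)
{
	(void)dev;
	return (true);
}

/* Stub: would queue an async teardown + reload (a la mlx4_restart_one). */
static void
schedule_restart(struct vf_dev *dev)
{
	(void)dev;
	printf("scheduling device reset/reload\n");
}

static void
health_poll(struct vf_dev *dev)
{
	if (!dev->internal_error && comm_channel_dead(dev)) {
		dev->internal_error = true;	/* commands now fail fast */
		schedule_restart(dev);		/* recover instead of hanging */
	}
}

int
main(void)
{
	struct vf_dev dev = { .internal_error = false };

	health_poll(&dev);	/* detects the error and schedules recovery */
	health_poll(&dev);	/* second poll is a no-op */
	return (0);
}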
