From: bugzilla-noreply@freebsd.org
To: virtualization@FreeBSD.org
Subject: [Bug 265487] bhyve drops the PCI passthrough device in a VM if the host is suspended/resumed
Date: Thu, 28 Jul 2022 21:58:10 +0000
List-Archive: https://lists.freebsd.org/archives/freebsd-virtualization

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=265487

            Bug ID: 265487
           Summary: bhyve drops the PCI passthrough device in a VM if the host
                    is suspended/resumed
           Product: Base System
           Version: 13.1-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: bhyve
          Assignee: virtualization@FreeBSD.org
          Reporter: jon@xyinn.org

Hello,

I've been debugging an issue with the creator of wifibox (PÁLI Gábor János) that I've been experiencing on my Thinkpad X1 Carbon Gen 7 on 13.1-STABLE (n251926-488f9d85278). I also reproduced it on 13.1-RELEASE (n250148-fc952ac2212) while checking whether this was a regression; it is not. My particular scenario is using 'wifibox' to pass my wifi card to an Alpine Linux VM using bhyve. Most of the details and logs from our debugging have been posted in this bug report:

https://github.com/pgj/freebsd-wifibox/issues/31

It seems that when I suspend my laptop (host) while the bhyve VM is running, the system resumes successfully, but the wifi non-deterministically stops working (and very soon: within about 1-5 suspend/resume cycles). Upon further investigation, we noticed that the wireless interface and the entire PCI card were completely missing from within the VM. The only way to "fix" it was to reboot.

Upon further experimentation, I noticed that if I stopped the VM, then suspended/resumed, and then started the VM again, the PCI card would be successfully available again. This workaround so far hasn't failed for me (but I'm still testing it throughout the day).
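
For reference, the "device missing from within the VM" condition can be checked mechanically inside the guest. The following is only a minimal sketch, not part of wifibox: the function name pci_slot_present is hypothetical, and it assumes the passthrough card sits at guest slot 00:06.0 and that lspci prints the slot as the first field, as in the listings in this report.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the given PCI slot appears as the
# first field of any line of lspci-style output read from stdin.
# The slot is compared as a literal string, so the dots are not treated as
# regex wildcards.
pci_slot_present() {
    awk -v slot="$1" '$1 == slot { found = 1 } END { exit !found }'
}

# Example use inside the guest (slot number assumed from this report):
#   lspci | pci_slot_present 00:06.0 || echo "passthrough device is missing"
```

Running such a check after each resume would make it easier to count how many suspend/resume cycles it takes before the device disappears.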

My suspicion is that when bhyve has passed a PCI device through and the host is suspended, the resources for that handle aren't released correctly, and the machine may be left in an inconsistent state with regard to this specific passthrough device.

To post some logs, we can see the following before and after inside the bhyve VM:

Before
-----------

wifibox:~# lspci
00:1f.0 Class 0601: 8086:7000
00:04.2 Class 0100: 1af4:1009
00:04.0 Class 0100: 1af4:1001
00:00.0 Class 0600: 1275:1275
00:04.3 Class 0100: 1af4:1009
00:06.0 Class 0280: 8086:02f0
00:04.1 Class 0100: 1af4:1009
00:05.0 Class 0200: 8086:100f

wifibox:~# ifconfig
eth0      Link encap:Ethernet HWaddr 00:A0:98:8A:05:71
          inet addr:10.1.0.1 Bcast:0.0.0.0 Mask:255.0.0.0
          inet6 addr: fe80::2a0:98ff:fe8a:571/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:20396 errors:0 dropped:0 overruns:0 frame:0
          TX packets:33557 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:4197622 (4.0 MiB) TX bytes:27342203 (26.0 MiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:65536 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

wlan0     Link encap:Ethernet HWaddr F8:E4:E3:EB:35:02
          inet addr:192.168.1.139 Bcast:192.168.1.255 Mask:255.255.255.0
          inet6 addr: fe80::fae4:e3ff:feeb:3502/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:34620 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20133 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:27437272 (26.1 MiB) TX bytes:4448962 (4.2 MiB)

wifibox:~# dmesg | grep iwlwifi
[    0.869316] iwlwifi 0000:00:06.0: can't derive routing for PCI INT A
[    0.869318] iwlwifi 0000:00:06.0: PCI INT A: not connected
[    0.870101] iwlwifi 0000:00:06.0: Failed to set affinity mask for IRQ 41
[    0.926465] iwlwifi 0000:00:06.0: api flags index 2 larger than supported by driver
[    0.926473] iwlwifi 0000:00:06.0: TLV_FW_FSEQ_VERSION: FSEQ Version: 89.3.35.37
[    0.926601] iwlwifi 0000:00:06.0: loaded firmware version 66.f1c864e0.0 QuZ-a0-jf-b0-66.ucode op_mode iwlmvm
[    0.946738] iwlwifi 0000:00:06.0: Detected Intel(R) Wireless-AC 9560 160MHz, REV=0x354
[    1.061975] iwlwifi 0000:00:06.0: Detected RF JF, rfid=0x105110
[    1.118696] iwlwifi 0000:00:06.0: base HW address: f8:e4:e3:eb:35:02
[    5.207325] iwlwifi 0000:00:06.0: Unhandled alg: 0x3f0707

After
------

wifibox:~# lspci
00:1f.0 Class 0601: 8086:7000
00:04.2 Class 0100: 1af4:1009
00:04.0 Class 0100: 1af4:1001
00:00.0 Class 0600: 1275:1275
00:04.3 Class 0100: 1af4:1009
00:04.1 Class 0100: 1af4:1009
00:05.0 Class 0200: 8086:100f

wifibox:~# ifconfig
eth0      Link encap:Ethernet HWaddr 00:A0:98:8A:05:71
          inet addr:10.1.0.1 Bcast:0.0.0.0 Mask:255.0.0.0
          inet6 addr: fe80::2a0:98ff:fe8a:571/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:145 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:143406 (140.0 KiB) TX bytes:1594 (1.5 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:65536 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

wifibox:~# dmesg | grep iwlwifi

Here we can see that the '00:06.0 Class 0280: 8086:02f0' device has completely disappeared, along with the wlan0 interface.

Using my "stop wifibox before sleep to release the PCI resource + start it back up later" approach in this way:

root@leslie:~ # cat /usr/local/etc/devd/wifibox.conf
# This is a `devd(8)` configuration file to run the resume action of
# wifibox on the ACPI resume event.
# Review the contents and create a
# copy of it without the `.sample` extension to use it. Restart the
# `devd` service once the file has been created.

notify 11 {
    match "system" "ACPI";
    match "subsystem" "Suspend";
    action "logger 'Stopping wifibox before suspend' && /usr/local/sbin/wifibox stop && /etc/rc.suspend acpi $notify";
};

notify 11 {
    match "system" "ACPI";
    match "subsystem" "Resume";
    action "/etc/rc.resume acpi $notify && logger 'Starting wifibox after resume and getting IP via DHCP' && /usr/local/sbin/wifibox start && /sbin/dhclient wifibox0";
};

we can see the following in the logs (correct flow on the host):

bridge0: Ethernet address: 58:9c:fc:10:ff:99
bridge0: changing name to 'wifibox0'
tap0: Ethernet address: 58:9c:fc:10:c0:3f
tap0: promiscuous mode enabled
wifibox0: link state changed to DOWN
ppt0 mem 0xea238000-0xea23bfff at device 20.3 on pci0
tap0: link state changed to UP
wifibox0: link state changed to UP
lo0: link state changed to UP
...
tap0: link state changed to DOWN
wifibox0: link state changed to DOWN
ppt0: detached
pci0: at device 20.3 (no driver attached)
ue0: link state changed to UP
tap0: promiscuous mode disabled
...
ppt0: detached
pci0: at device 20.3 (no driver attached)
ppt0 mem 0xea238000-0xea23bfff at device 20.3 on pci0
bridge0: Ethernet address: 58:9c:fc:10:ff:99
bridge0: changing name to 'wifibox0'
tap0: Ethernet address: 58:9c:fc:10:c0:3f
tap0: promiscuous mode enabled
wifibox0: link state changed to DOWN
tap0: link state changed to UP
wifibox0: link state changed to UP
tap0: link state changed to DOWN
wifibox0: link state changed to DOWN
ppt0: detached
pci0: at device 20.3 (no driver attached)
ppt0 mem 0xea238000-0xea23bfff at device 20.3 on pci0
tap0: link state changed to UP
wifibox0: link state changed to UP
tap0: link state changed to DOWN
wifibox0: link state changed to DOWN
ppt0: detached
pci0: at device 20.3 (no driver attached)
tap0: promiscuous mode disabled

Regarding the devd rule for wifibox, notice that in the Suspend action, the "/etc/rc.suspend acpi $notify" call happens at the end. I suspect this behaves like a stack (which makes sense, since we are basically reversing things when we suspend/resume). When I made that call at the beginning, the PCI passthrough issue arose, but when I placed it at the end it was fine. I'm guessing that when the suspend command is triggered first, the wifibox section doesn't have enough time to fully release the resources, which yields the same issue that I mentioned before. Perhaps that rule can help narrow this issue down further.

At the moment I'm testing some slight improvements to the devd workaround that PÁLI suggested, which should help speed up the suspend/resume times and remove the need for me to explicitly call the dhclient command. Ultimately, we'll want to fix the bhyve issue. I don't have any experience with bhyve or its internals, but I can help with debugging across RELEASE/STABLE/CURRENT.

- Jonathan

-- 
You are receiving this mail because:
You are the assignee for the bug.