[Bug 241774] FreeBSD 11.3 & 12.0 has broken SCSI & Networking on KVM/QEMU Q35 with OVMF

bugzilla-noreply at freebsd.org bugzilla-noreply at freebsd.org
Thu Jan 9 08:44:25 UTC 2020


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241774

--- Comment #18 from John Hartley <drum at graphica.com.au> ---
(In reply to Tommy P from comment #16)

Hi Tommy P,

Hooray, great to see you have got VirtIO sorted!

I have now got my mutant 11.3 kernel, with the 11.2 network code, going and ...

It works !!

uname -a
FreeBSD newt.in.graphica.com.au 11.3-RELEASE FreeBSD 11.3-RELEASE #26: Thu Jan  9 18:38:41 AEDT 2020     root@newt.in.graphica.com.au:/usr/obj/usr/src/sys/GENERIC2  amd64

Only tested e1000 at the moment:

% ifconfig -a
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        groups: lo
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
        ether 52:54:00:4e:50:91
        hwaddr 52:54:00:4e:50:91
        inet 192.168.73.131 netmask 0xffffff00 broadcast 192.168.73.255
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

My approach was a bit of a sledgehammer, as I have ended up with the following
directories and files all being regressed back to 11.2:


 % find . -name '*.bak*' -print
./sbin/ipfw.bak
./sys/modules/mlx5en.bak
./sys/modules/netmap/Makefile.bak
./sys/modules/cxgbe.bak
./sys/modules/cxgbe.bak/if_cxgbe/Makefile.bak
./sys/modules/ipfw_nat64.bak
./sys/modules/mlx5.bak
./sys/modules/urtwn.bak
./sys/amd64/vmm.bak
./sys/amd64/pci.bak
./sys/dev/virtio/network.bak
./sys/dev/vmware/vmxnet3.bak
./sys/dev/pci/pcireg.h.bak.11.2
./sys/dev/cxgbe/firmware.bak.11.3
./sys/dev/usb/wlan.bak
./sys/dev/pci.bak
./sys/dev/netmap.bak
./sys/dev/e1000.bak
./sys/dev/ixgbe.bak
./sys/dev/re.bak
./sys/dev/iwi.bak
./sys/dev/malo.bak
./sys/dev/mwl.bak
./sys/dev/ral.bak
./sys/dev/ixl.bak
./sys/dev/bwi.bak
./sys/dev/cxgbe.bak
./sys/dev/mlx5.bak
./sys/dev/oce.bak
./sys/dev/rtwn.bak
./sys/dev/urtwn.bak
./sys/conf/files.bak
./sys/net.bak
./sys/netinet.bak
./sys/net80211.bak
./sys/netinet6.bak
./sys/netpfil.bak

So much had to be regressed because of the ripple up / down impacts of
/dev/netmap. As you can see, this is pretty much the entire network
subsystem... :-(
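
For anyone wanting to repeat this kind of swap, a minimal sketch follows. It
assumes two source checkouts side by side (one 11.3, one 11.2); the helper
name regress_dir and the example paths are made up for illustration and are
not the exact commands used here.

```shell
#!/bin/sh
# Hypothetical helper: replace one subtree of the newer source tree with the
# copy from the older tree, keeping the newer version as <subtree>.bak.
regress_dir() {
    new_tree=$1   # newer checkout, e.g. an 11.3 /usr/src (assumed path)
    old_tree=$2   # older checkout, e.g. an 11.2 tree (assumed path)
    sub=$3        # subtree relative to both, e.g. sys/dev/e1000

    if [ ! -d "$old_tree/$sub" ]; then
        echo "missing $old_tree/$sub" >&2
        return 1
    fi
    if [ -e "$new_tree/$sub.bak" ]; then
        echo "$new_tree/$sub.bak already exists" >&2
        return 1
    fi
    # Set the newer subtree aside, then copy the older one into its place.
    mv "$new_tree/$sub" "$new_tree/$sub.bak" &&
        cp -R "$old_tree/$sub" "$new_tree/$sub"
}

# Example call (paths are assumptions):
#   regress_dir /usr/src /usr/src-11.2 sys/dev/e1000
```

Keeping the .bak copy alongside makes it easy to list what was swapped with
find, and to move individual pieces forward again later.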

I am only able to post this because, now that the network is up, I can ssh
into the box to get the results out.

First though I am going to test all the other network devices...

Now done:

% ifconfig -a
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        groups: lo
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
        ether 52:54:00:4e:50:91
        hwaddr 52:54:00:4e:50:91
        inet 192.168.73.131 netmask 0xffffff00 broadcast 192.168.73.255
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
vmx0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=60039b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 52:54:00:0a:cc:0d
        hwaddr 52:54:00:0a:cc:0d
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect
        status: active
re0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
        ether 52:54:00:b2:20:53
        hwaddr 52:54:00:b2:20:53
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex,flowcontrol,rxpause,txpause>)
        status: active


So I have:

e1000 - Intel - em0
vmxnet3 - VMware vmxnet3 virtual NIC driver (VirtIO alternative) - vmx0
rtl8139 - Realtek - re0
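
As a cross-check of the list above, a small filter along these lines could
pull each interface name and its link status out of ifconfig output. This is
only a sketch: iface_status is a name made up here, and it assumes the
ifconfig layout shown in this comment (interface header at column 0, indented
"status:" line).

```shell
#!/bin/sh
# Hypothetical filter: print "<interface> <status>" for every interface in
# ifconfig output that reports a status line (lo0 has none, so it is skipped).
iface_status() {
    awk '
        # Interface header, e.g. "em0: flags=..."; remember the name.
        /^[a-z]+[0-9]+:/ { name = $1; sub(/:$/, "", name) }
        # Indented "status: active" / "status: no carrier" line.
        $1 == "status:"  { $1 = ""; sub(/^ /, ""); print name, $0 }
    '
}

# Example call:
#   ifconfig -a | iface_status
```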

This test also included VirtIO, but as per your testing this does not work and
I need to include your fix.

So in summary:

The issue is not with QEMU / KVM, but with FreeBSD code changes from
11.2 -> 11.3, which 12.x has inherited.

There are two bugs:

1. Confirmed - VirtIO - the one Tommy P has helped resolve
2. Speculative - a /dev/netmap bug, or elsewhere

Bug (2) is the one breaking the other Q35 (non-VirtIO) interfaces (e1000,
vmxnet3, rtl8139).

Now that I have a working base, I am first going to move the PCI code forward
and validate that it is OK, and then look at the netmap code, which I believe
is what is causing the issue with all the other network devices.

Once I get to the specific commit/s that introduced the bug I will provide an
update.
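
One way to narrow this down is a plain bisection over the revisions between
the last known-good and first known-bad trees. The sketch below is an
assumption about how that could be scripted, not what has been run here:
first_bad and is_good are made-up names, the revision numbers are
placeholders, and is_good stands in for "build a kernel at that revision,
boot the Q35 guest and check the NIC passes traffic".

```shell
#!/bin/sh
# Placeholder test: pretend every revision below 5 is good. In practice this
# would build and boot a kernel at revision $1 and test the network.
is_good() {
    [ "$1" -lt 5 ]
}

# Binary search for the first bad revision.
# Arguments: revisions, oldest first; the first must be known good and the
# last known bad.
first_bad() {
    lo=1      # index of a known-good revision
    hi=$#     # index of a known-bad revision
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "rev=\${$mid}"          # fetch the mid-th argument
        if is_good "$rev"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    eval "echo \${$hi}"
}

# Example call with placeholder revision numbers:
#   first_bad 1 2 3 4 5 6 7 8
```

With N candidate revisions this needs roughly log2(N) kernel builds rather
than N, which matters when each test means a rebuild and reboot.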

BTW - I am really surprised that such a large and impactful change to the
network sub-system was part of a minor release cycle and that nothing was
mentioned in the release notes.

Cheers and thanks again Tommy.

John Hartley.


More information about the freebsd-virtualization mailing list