FreeBSD serious problems with vnet on if_bridge (probably ARP related?)

From: FreeBSD User <>
Date: Sun, 15 May 2022 11:40:57 UTC

I ran into serious trouble setting up a FreeBSD 12.3-RELEASE-p5 host that has a second NIC
with vnet jails attached to it (basically, the host is a recent XigmaNAS with
Bastille jails). Hopefully, I can gather some answers here.

The host has two NICs, em0 (management only) and igb0 (service/jails).
The host, the jails and the igb0 interface all reside on the same network, but the two
NICs are connected to two different ports on the switch (I assume; we do not have access
to the campus network infrastructure).

Both NICs have an IPv4 address from the same network. The host listens on both
NICs for services, port 22 (ssh) for instance; other services (NFS, SMB and so on) are
supposed to be bound to the second NIC, igb0. em0 only listens on 22/tcp and
443/tcp, as it is meant for management only.
igb0 is a member of a bridge, as are the vnet interfaces (epair) of the jails, created
via "jib" (the host is basically a recent XigmaNAS). The igb0 NIC also has
an IP from the LAN. Some advice says to do so, other advice says to avoid setting an IP,
but the management interface of XigmaNAS forces me to apply one. Just for the record.

Problem: connections are dropped unpredictably. ARP packets reach a vnet in some cases
and do not reach it in others. The phenomenon is weird. The
host is running 6 jails with regular FQDNs and IPv4 addresses. All jails are up, and the
base NIC (igb0) is also up and running.
It is possible, in rare cases, to connect via ssh from remote sites to at most two(!) of
the jails. It is impossible to predict which jail will accept a connection first; it is
like a lottery which one will respond.
Pinging the jails from the LAN or from remote sites is also weird: as with the ssh
connections, sometimes a jail responds immediately as expected, in other cases it takes
10 - 30 seconds until ping starts to report replies.

Connecting to the base host has never been a problem, so I assume the base network to be
all right.
Entering a jail locally via "jexec -l" and performing some tcpdump investigations
reveals weird results:

tcpdump -vi vnet0 arp

shows a plethora of ARP packets from the local network on a jail that is responding
quickly to ping or is giving access to ssh, but on those jails which do not respond to
connection and ping attempts, ARP packets seem to vanish.
With terminals for those jails open side by side while pinging and tcpdumping, it is
obvious that the "arp: who-has ... tell" requests are not evenly distributed to all
bridge members as I'd expect.
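
To put numbers on that impression, the captured output could be tallied per requested
address and compared between jails. The following is only a sketch: count_who_has is a
made-up helper name, and it assumes tcpdump's usual one-line ARP format in which the
requested address directly follows the word "who-has".

```shell
#!/bin/sh
# Hypothetical helper: read tcpdump ARP lines on stdin and print
# "<requested-ip> <count>" for every who-has request seen, sorted by address.
count_who_has() {
    awk '{
        # The requested address is the field right after "who-has".
        for (i = 1; i < NF; i++)
            if ($i == "who-has") { n[$(i + 1)]++; break }
    }
    END { for (ip in n) print ip, n[ip] }' | sort
}

# Possible use inside each jail (the count of 200 packets is arbitrary):
#   tcpdump -n -c 200 -i vnet0 arp | count_who_has
```

Running the same capture in a responsive and in an unresponsive jail over the same period
should give similar counts if the bridge floods the broadcasts to every member.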

Following some advice found on the web, the following sysctl settings are applied to
if_bridge (main host):

device	if_bridge 0 0 0 0 0 0 0 0 0

I also fiddled around with disabling LRO, TSO, RXCSUM and TXCSUM on all physical NICs, as
recommended somewhere because of buggy FreeBSD NIC drivers, but it did not help. Please
see below for some physical details of the NICs in the box.
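
For reference, the offload features named above correspond to ifconfig(8) flags. A minimal
sketch, assuming the interface names em0 and igb0 from this post; disable_offload is a
hypothetical helper that only prints the commands unless RUN=1 is set in the environment:

```shell
#!/bin/sh
# Hypothetical helper: for each interface given, build the ifconfig call that
# turns off LRO, TSO and RX/TX checksum offload. By default the command is
# only printed (dry run); with RUN=1 it is executed instead.
disable_offload() {
    for nic in "$@"; do
        cmd="ifconfig $nic -lro -tso -rxcsum -txcsum"
        if [ "${RUN:-0}" = 1 ]; then
            $cmd
        else
            echo "$cmd"
        fi
    done
}

# Dry run for the two physical NICs of this host.
disable_offload em0 igb0
```

Settings applied this way do not survive a reboot; whether and where they can be made
persistent on XigmaNAS is not covered here.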

In another department I've set up a similar box (also XigmaNAS), but there the second
NIC is a different type (i350-T2). The physical port that is a member of the if_bridge on
which the vnet epairs of the jails reside is also set up with an IPv4 address (as
described above), but it belongs to a different network and does not share the same
switch/network with the dedicated management port. That box's jails are acting and
responding as expected.

Thanks in advance,

O. Hartmann

igb0@pci0:4:0:0:	class=0x020000 card=0x00028086 chip=0x15338086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I210 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xf7900000, size 1048576, enabled
    bar   [1c] = type Memory, range 32, base 0xf7a00000, size 16384, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks 
    cap 11[70] = MSI-X supports 5 messages, enabled
                 Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
    cap 10[a0] = PCI-Express 2 endpoint max data 128(512) FLR NS
                 max read 512
                 link x1(x1) speed 2.5(2.5) ASPM L1(L0s/L1)
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 0 corrected
    ecap 0003[140] = Serial 1 somenumber
    ecap 0017[1a0] = TPH Requester 1

em0@pci0:0:25:0:	class=0x020000 card=0x20528086 chip=0x153b8086 rev=0x04 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection I217-V'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xf7d00000, size 131072, enabled
    bar   [14] = type Memory, range 32, base 0xf7d35000, size 4096, enabled
    bar   [18] = type I/O Port, range 32, base 0xf080, size 32, enabled
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 13[e0] = PCI Advanced Features: FLR TP