lagg(4) + VLAN + if_bridge(4) vs. ARP

John Nielsen lists at jnielsen.net
Fri Jan 8 21:52:57 UTC 2016


Hi all-

I'm trying to troubleshoot a problem on a machine running recent 10-STABLE. The machine has two physical interfaces and hosts a number of services, including a bhyve VM (FreeBSD 10.2-RELEASE) acting as a network appliance. The VM has three interfaces: external, internal-trusted and internal-guest. Each VM interface is plumbed to a TAP device on the host which in turn is a member of a bridge. Here is the current (working) setup:

External <--------> Host <-> Host <-> Host <-> VM
port                re0      bridge2  tap21    vtnet1

Switch <-> Host <-> Host <-> Host <-> Host <-> VM
port       em0      em0.2    bridge0  tap20    vtnet0
            ^
            \-----> Host <-> Host <-> Host <-> VM
                    em0.103  bridge1  tap22    vtnet2

Since there is not much external traffic, most of the bandwidth potential of re0 is wasted while em0 is sometimes busy. So I'd like to move to a LAGG setup, as below:

External  Trusted  Untrusted
VLAN 99   VLAN 2   VLAN 103
  |         |        |
  \         |       /
   /---------------\   /------> Host <--> Host <-> Host <-> VM
   |     switch    |   |        lagg0.99  bridge2  tap21    vtnet1
   \---------------/   |
       |    |          |  /---> Host <--> Host <-> Host <-> VM
       |    v          |  |     lagg0.2   bridge0  tap20    vtnet0
       |   Host        v  v
       \   re0 <-----> Host <-> Host <--> Host <-> Host <-> VM
        \              lagg0    lagg0.103 bridge1  tap22    vtnet2
         \-> Host       ^
             em0 <------/

In other words: plug the external port into the switch, create a new "external" VLAN, add both em0 and re0 to a new LAGG, and create VLAN child interfaces off of that.
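For readers following along, here is a rough sketch of the layout in the diagram as ifconfig commands. It is not the actual /etc/rc.conf from this machine: the "lacp" laggproto is an assumption (the message does not say which protocol was used), and the RUN variable and lagg_setup function are hypothetical helpers added only so the commands can be previewed safely.

```sh
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Run with RUN= (set but empty) as root to actually apply the commands.
RUN=${RUN-echo}

lagg_setup() {
    # Aggregate both physical ports. "lacp" is an assumption; the
    # original message does not name the laggproto in use.
    $RUN ifconfig em0 up
    $RUN ifconfig re0 up
    $RUN ifconfig lagg0 create
    $RUN ifconfig lagg0 laggproto lacp laggport em0 laggport re0 up

    # One VLAN child, bridge and tap per network, as in the diagram.
    for triple in 99:bridge2:tap21 2:bridge0:tap20 103:bridge1:tap22; do
        vlan=${triple%%:*}      # text before the first colon
        rest=${triple#*:}       # text after the first colon
        bridge=${rest%%:*}
        tap=${rest#*:}
        $RUN ifconfig "lagg0.${vlan}" create vlan "$vlan" vlandev lagg0 up
        $RUN ifconfig "$tap" create
        $RUN ifconfig "$bridge" create
        $RUN ifconfig "$bridge" addm "lagg0.${vlan}" addm "$tap" up
    done
}

lagg_setup
```

Printing the commands first makes it easy to compare them against what rc.conf actually produced (for example with "ifconfig lagg0" and "ifconfig bridge2") before changing anything on a live host.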

I tried the new setup today and it worked, except that the VM no longer received ARP replies from the external network. Using tcpdump on the host's lagg0.99, I saw the ARP request from the VM go out and an ARP reply come back, but that's as far as it went. I did not see the ARP reply on the host's bridge2 or tap21 interfaces, and the VM never received it.
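The hop-by-hop check described above can be repeated with one capture per interface on the external path. The sketch below only prints the tcpdump commands, since each one is best run in its own terminal; the arp_trace_cmds function name is hypothetical, and the interface list is taken from the diagram.

```sh
#!/bin/sh
# Print one ARP capture command per hop on the external path.
# -n: no name resolution; -e: show link-level headers (MAC addresses),
# which helps confirm where the reply is addressed at each hop.
arp_trace_cmds() {
    for ifn in lagg0.99 bridge2 tap21; do
        echo "tcpdump -n -e -i $ifn arp"
    done
}

arp_trace_cmds
```

The first interface in that list where the reply stops appearing is the place to look more closely.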

I didn't make any changes on the VM, and all I changed on the host was the networking via /etc/rc.conf. The host does run ipfw but I verified that none of the rules reference any stale interface names. I have also previously disabled all firewalling of bridged packets:
  net.link.bridge.pfil_onlyip=0
  net.link.bridge.pfil_member=0
  net.link.bridge.pfil_bridge=0

I also verified that "ifconfig bridge2 addr" contained the MAC addresses of both the VM and the external device on the correct ports.

So in the LAGG setup, why aren't the ARP replies going across bridge2 to the VM? Any ideas on how to narrow down the cause would be appreciated.

Thanks!

-John Nielsen


