[Bug 260973] pf: firewall rules stop matching when vnet jails share interface names with the host

From: <bugzilla-noreply_at_freebsd.org>
Date: Thu, 06 Jan 2022 09:14:12 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260973

            Bug ID: 260973
           Summary: pf: firewall rules stop matching when vnet jails share
                    interface names with the host
           Product: Base System
           Version: 13.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: thomas@gibfest.dk

Hello,

I've been building a new vnet jailhost on 13 and I am hitting a weird issue
where pf stops permitting traffic it clearly has rules to allow after
interfaces inside vnet jails are renamed to the same name as the host interface
with the pf rule.

This is on FreeBSD nuc1.servers.bornhack.org 13.0-STABLE FreeBSD 13.0-STABLE #1
stable/13-d208638c5: Wed Jan  5 13:32:08 UTC 2022    
root@nuc1.servers.bornhack.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

The complete ruleset is pretty complex but I've managed to cook it down to a
few lines:

[tykling@nuc1 ~]$ cat testpf.conf 
block log all
set skip on lo0
pass in quick on { em0 } proto { tcp } from { 85.235.250.87 } to { (em0) } port { 22 }
[tykling@nuc1 ~]$ 

The host has an em0 interface:

[tykling@nuc1 ~]$ ifconfig em0
em0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=481049b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,VLAN_HWFILTER,NOMAP>
        ether 1c:69:7a:ab:fe:be
        inet 85.209.118.130/28 broadcast 85.209.118.143
        inet6 fe80::1e69:7aff:feab:febe%em0/64 scopeid 0x1
        inet6 2a09:94c4:55d1:7680::82/64
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
[tykling@nuc1 ~]$ 

The issue seems to be triggered by renaming epair interfaces inside vnet jails
to the same name as an interface on the host.

The above pf ruleset works and keeps working if I don't start any vnet jails.
It also keeps working if I start vnet jails but don't rename interfaces. It
also keeps working if I start vnet jails but rename the interfaces to something
other than em0.

Existing states established before the issue happens keep working (I am working
remotely via ssh on the server), but new connections seem to ignore the pass
rule on em0 and are blocked by the "block log all" rule (rule 0) instead:

06:08:46.357935 rule 0/0(match): block in on em0: 85.235.250.87.40108 > 85.209.118.130.22: Flags [S], seq 909787121, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 799486870 ecr 0], length 0
06:08:47.358590 rule 0/0(match): block in on em0: 85.235.250.87.40108 > 85.209.118.130.22: Flags [S], seq 909787121, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 799487870 ecr 0], length 0
06:08:49.557897 rule 0/0(match): block in on em0: 85.235.250.87.40108 > 85.209.118.130.22: Flags [S], seq 909787121, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 799490070 ecr 0], length 0
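
For anyone comparing runs with different numbers of jails, it may help to count
the blocked packets per interface and destination rather than reading the raw
log. The following is only a sketch: it assumes each tcpdump entry from pflog0
is on a single line in the format shown above, and the function name
summarize_blocks is made up for illustration.

```shell
#!/bin/sh
# summarize_blocks: count pf "block" log entries per direction, interface
# and destination. Reads tcpdump text output (as printed for pflog0) from
# the given files, or from stdin when no file is given.
# Assumed field layout of one log line:
#   $1 time  $2 "rule"  $3 "0/0(match):"  $4 action  $5 direction
#   $6 "on"  $7 "em0:"  $8 source  $9 ">"  $10 "addr.port:"
summarize_blocks() {
    awk '$2 == "rule" && $4 == "block" {
        ifname = $7; sub(/:$/, "", ifname)   # strip trailing colon
        dst = $10;   sub(/:$/, "", dst)      # destination as addr.port
        count[$5 " " ifname " " dst]++
    }
    END { for (k in count) print count[k], k }' "$@"
}
```

It could be fed from a saved capture, e.g. "summarize_blocks pflog.txt", where
pflog.txt is a hypothetical file holding the tcpdump output.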

A wild guess as to the reason might be a race leading to some confusion over
which em0 interface is which?

Some more observations:
- It didn't seem to happen with just one vnet jail when I tried narrowing it
down. Enabling and starting three more made the problem occur almost instantly.
- Rebooting with four jails plus the above ruleset enabled means never getting
any contact with the server at all (i.e. the problem manifests from boot).
- Results with two jails were less consistent. The number of jails/interface
renames seems to play a role in whether or not the issue is triggered.
- A "service jail restart" will trigger it almost instantly if it hasn't
happened already.
- Renaming interfaces to something other than "em0" works without any
issues.

I hope reproducing will be possible. I've included the jail.conf file for one
of the jails below:

[tykling@nuc1 ~]$ cat /var/run/jail.syslog1_servers_bornhack_org.conf
# Generated by rc.d/jail at 2022-01-06 08:19:08
syslog1_servers_bornhack_org {
        host.hostname = "syslog1.servers.bornhack.org";
        path = "/usr/jails/syslog1.servers.bornhack.org";
        vnet;
        vnet.interface = "epair2b";
        exec.clean;
        exec.system_user = "root";
        exec.jail_user = "root";
        exec.prestart += "ifconfig epair2a destroy 2>/dev/null || true && ifconfig epair2 create up && ifconfig epair2a up && ifconfig bridge1 addm epair2a up";
        exec.start += "/sbin/ifconfig epair2b name em0 && ifconfig em0 10.1.0.3/24 && ifconfig em0 inet6 2a09:94c4:55d1:76A0::3/64";
        exec.start += "route add -inet default 10.1.0.1";
        exec.start += "route add -inet6 default 2a09:94c4:55d1:76A0::1";
        exec.poststop += "ifconfig bridge1 deletem epair2a && ifconfig epair2a destroy";
        exec.start += "/bin/sh /etc/rc";
        exec.stop = "/bin/sh /etc/rc.shutdown jail";
        exec.consolelog = "/var/log/jail_syslog1_servers_bornhack_org_console.log";
        mount.fstab = "/etc/fstab.syslog1_servers_bornhack_org";
        allow.set_hostname = 0;
        allow.sysvipc = 0;
        enforce_statfs = "2";
}
[tykling@nuc1 ~]$

The interesting sections I guess are:
- in exec.prestart (on the host) where the epair interface is destroyed,
recreated and added to a bridge
- and in exec.start (inside the jail) where the interface is renamed to em0 and
then configured with v4 and v6.
I've included the whole thing in case it is useful to someone.
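
Condensed into a standalone script, those two steps might look roughly like
the sketch below. It is untested and rests on several assumptions that are not
part of my setup: the jail names (reproN), path=/, persist, and the 10.1.0.x
host numbers are placeholders, and bridge1 is assumed to exist already. It
defaults to printing the commands; set DRY_RUN=0 as root on a test host to
actually run them.

```shell
#!/bin/sh
# Sketch of the per-jail steps from jail.conf, without rc.d/jail.
# DRY_RUN defaults to 1, so commands are only printed unless DRY_RUN=0.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# start_vnet_jail N: create epairN, add the host side to bridge1, then
# start a vnet jail and rename its interface to em0 inside the jail.
start_vnet_jail() {
    n=$1
    # Host side (corresponds to exec.prestart).
    run ifconfig "epair${n}" create up
    run ifconfig "epair${n}a" up
    run ifconfig bridge1 addm "epair${n}a" up
    # Hypothetical jail name and path; the real jails use their own roots.
    run jail -c name="repro${n}" path=/ persist vnet vnet.interface="epair${n}b"
    # Jail side (corresponds to exec.start): the rename that triggers the issue.
    run jexec "repro${n}" ifconfig "epair${n}b" name em0
    run jexec "repro${n}" ifconfig em0 "10.1.0.$((n + 10))/24"
}

# stop_vnet_jail N: remove the jail and clean up (corresponds to exec.poststop).
stop_vnet_jail() {
    n=$1
    run jail -r "repro${n}"
    run ifconfig bridge1 deletem "epair${n}a"
    run ifconfig "epair${n}a" destroy
}
```

Calling start_vnet_jail for 1 through 4 would match the number of jails at
which the problem showed up reliably for me.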

I hope someone is able to reproduce this. If not, I will have to narrow it
down further, so please let me know. I have run out of time for now.

Thanks! :)
