From: Johan Hendriks
Date: Sat, 12 Mar 2022 15:18:38 +0100
Subject: Re: epair and vnet jail loose connection.
To: Kristof Provost
Cc: Michael Gmelin, freebsd-net@freebsd.org, Patrick M. Hausen
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net

With this minimal setup I can see the haproxy jail drop off the network.

Two jails: one with haproxy and one with nginx, which serves the following HTML file.

<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>

<h1>My First Heading</h1>
<p>My first paragraph.</p>

</body>
</html>

From a remote machine I run:

hey -h2 -n 10 -c 10 -z 300s https://wp.test.nl

A ping from the jail host to the haproxy jail then shows the following:

[ /] > ping 10.233.185.20
PING 10.233.185.20 (10.233.185.20): 56 data bytes
64 bytes from 10.233.185.20: icmp_seq=0 ttl=64 time=0.054 ms
64 bytes from 10.233.185.20: icmp_seq=1 ttl=64 time=0.050 ms
64 bytes from 10.233.185.20: icmp_seq=2 ttl=64 time=0.041 ms
<SNIP>
64 bytes from 10.233.185.20: icmp_seq=169 ttl=64 time=0.050 ms
64 bytes from 10.233.185.20: icmp_seq=170 ttl=64 time=0.154 ms
64 bytes from 10.233.185.20: icmp_seq=171 ttl=64 time=0.054 ms
64 bytes from 10.233.185.20: icmp_seq=172 ttl=64 time=0.039 ms
64 bytes from 10.233.185.20: icmp_seq=173 ttl=64 time=0.160 ms
64 bytes from 10.233.185.20: icmp_seq=174 ttl=64 time=0.045 ms
^C
--- 10.233.185.20 ping statistics ---
335 packets transmitted, 175 packets received, 47.8% packet loss
round-trip min/avg/max/stddev = 0.037/0.070/0.251/0.040 ms
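
The summary is consistent with the jail answering every probe up to icmp_seq=174 and nothing after that. A minimal Python sketch of that arithmetic follows; it assumes the snipped replies (icmp_seq 3 to 168) were all received, and the function name `loss_percent` is only illustrative.

```python
def loss_percent(transmitted: int, received: int) -> float:
    """Packet loss as a percentage, rounded to one decimal like ping(8) prints it."""
    if transmitted <= 0:
        raise ValueError("transmitted must be positive")
    return round(100.0 * (transmitted - received) / transmitted, 1)


# Figures taken from the ping summary above.
transmitted, received = 335, 175
last_seq_seen = 174  # last icmp_seq that got a reply

lost = transmitted - received
# Probes are numbered from 0, so the last one sent has icmp_seq = transmitted - 1.
unanswered_tail = (transmitted - 1) - last_seq_seen

print(f"lost {lost} of {transmitted}: {loss_percent(transmitted, received)}% loss")
print(f"probes sent after icmp_seq={last_seq_seen}: {unanswered_tail}")
# If both numbers match, all losses form one contiguous run at the end,
# i.e. the jail went silent once and never came back during this ping.
print("single stall at the end:", lost == unanswered_tail)
```

With ping's default interval of one second, that would put the stall roughly 175 seconds after the ping was started.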


ifconfig
vtnet0: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4c00bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,LINKSTATE,TXCSUM_IPV6>
        ether 56:16:e9:80:5e:41
        inet 87.233.191.146 netmask 0xfffffff0 broadcast 87.233.191.159
        inet 87.233.191.156 netmask 0xffffffff broadcast 87.233.191.156
        inet 87.233.191.155 netmask 0xffffffff broadcast 87.233.191.155
        inet 87.233.191.154 netmask 0xffffffff broadcast 87.233.191.154
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vtnet1: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4c07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,LINKSTATE,TXCSUM_IPV6>
        ether 56:16:2c:64:32:35
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 58:9c:fc:10:ff:82
        inet 10.233.185.1 netmask 0xffffff00 broadcast 10.233.185.255
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: epair20a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 7 priority 128 path cost 2000
        member: epair18a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 15 priority 128 path cost 2000
        groups: bridge
        nd6 options=9<PERFORMNUD,IFDISABLED>
bridge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 58:9c:fc:10:d9:1a
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: vtnet0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 1 priority 128 path cost 2000
        groups: bridge
        nd6 options=9<PERFORMNUD,IFDISABLED>
pflog0: flags=141<UP,RUNNING,PROMISC> metric 0 mtu 33160
        groups: pflog
epair18a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: jail_web01
        options=8<VLAN_MTU>
        ether 02:77:ea:19:c7:0a
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
epair20a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: jail_haproxy
        options=8<VLAN_MTU>
        ether 02:9b:93:8c:59:0a
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

jail.conf

# Global settings applied to all jails.
$domain = "test.nl";

exec.start = "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown";
exec.clean;

mount.fstab = "/storage/jails/$name.fstab";

exec.system_user  = "root";
exec.jail_user    = "root";
mount.devfs;
sysvshm="new";
sysvsem="new";
allow.raw_sockets;
allow.set_hostname = 0;
allow.sysvipc;
enforce_statfs = "2";
devfs_ruleset     = "11";

path = "/storage/jails/${name}";
host.hostname = "${name}.${domain}";


# Networking
vnet;
vnet.interface    = "vnet0";

  # Commands to run on host before jail is created
  exec.prestart  = "ifconfig epair${ip} create up description jail_${name}";
  exec.prestart  += "ifconfig epair${ip}a up";
  exec.prestart  += "ifconfig bridge0 addm epair${ip}a up";
  exec.created   = "ifconfig epair${ip}b name vnet0";

  # Commands to run in jail after it is created
  exec.start  += "/bin/sh /etc/rc";

  # Commands to run in jail when jail is stopped
  exec.stop  = "/bin/sh /etc/rc.shutdown";

  # Commands to run on host when jail is stopped
  exec.poststop  = "ifconfig bridge0 deletem epair${ip}a";
  exec.poststop  += "ifconfig epair${ip}a destroy";
  persist;

web01 {
    $ip = 18;
}

haproxy {
    $ip = 20;
    mount.fstab = "";
    path = "/storage/jails/${name}";
}
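
To make the per-jail hooks concrete, the sketch below expands the `${ip}` and `${name}` placeholders as they would resolve for the two jails above. It uses Python's `string.Template` purely as an illustration (jail(8) performs its own substitution), and the helper name `expand` is hypothetical. The command strings are copied from the jail.conf in this message.

```python
from string import Template

# Hook commands copied verbatim from the jail.conf above.
PRESTART = [
    "ifconfig epair${ip} create up description jail_${name}",
    "ifconfig epair${ip}a up",
    "ifconfig bridge0 addm epair${ip}a up",
]
CREATED = ["ifconfig epair${ip}b name vnet0"]
POSTSTOP = [
    "ifconfig bridge0 deletem epair${ip}a",
    "ifconfig epair${ip}a destroy",
]


def expand(commands, name, ip):
    """Substitute ${name} and ${ip} in each command string."""
    return [Template(c).substitute(name=name, ip=ip) for c in commands]


# The two jails defined above and their $ip values.
for jail_name, jail_ip in (("web01", 18), ("haproxy", 20)):
    print(f"# {jail_name}")
    for cmd in expand(PRESTART + CREATED + POSTSTOP, jail_name, jail_ip):
        print(cmd)
```

For the haproxy jail this yields epair20a as the bridge0 member and epair20b renamed to vnet0, which matches the `epair20a` entry in the ifconfig output.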
pf.conf

#######################################################################
ext_if="vtnet0"
table <bruteforcers> persist
table <torlist> persist
table <ssh-trusted> persist file "/usr/local/etc/pf/ssh-trusted"
table <custom-block> persist file "/usr/local/etc/pf/custom-block"
table <jailnetworks> { 10.233.185.0/24, 192.168.10.0/24 }

icmp_types = "echoreq"
junk_ports="{ 135,137,138,139,445,68,67,3222,17500 }"

# Log interface
set loginterface $ext_if
# Set limits
set limit { states 40000, frags 20000, src-nodes 20000 }

scrub on $ext_if all fragment reassemble no-df random-id

# ---- Nat jails to the web
binat on $ext_if from 10.233.185.15/32 to !10.233.185.0/24 -> 87.233.191.156/32 # saltmaste
binat on $ext_if from 10.233.185.20/32 to !10.233.185.0/24 -> 87.233.191.155/32 # haproxy
binat on $ext_if from 10.233.185.22/32 to !10.233.185.0/24 -> 87.233.191.154/32 # web-comb

nat on $ext_if from <jailnetworks> to any -> ($ext_if:0)
# ---- First rule obligatory "Pass all on loopback"
pass quick on lo0 all
pass quick on bridge0 all
pass quick on bridge1 all

# ---- Block TOR exit addresses
block quick proto { tcp, udp } from <torlist> to $ext_if

# ---- Second rule "Block all in and pass all out"
block in log all
pass out all keep state
# IPv6 pass in/out all IPv6 ICMP traffic
pass in quick proto icmp6 all

# Pass all lo0
set skip on lo0

############### FIREWALL ###############################################
# ---- Block custom IPs and logs
block quick proto { tcp, udp } from <custom-block> to $ext_if

# ---- Jail ports
pass in quick on { $ext_if } proto tcp from any to 10.233.185.22 port { smtp 80 443 993 995 1956 } keep state
pass in quick on { $ext_if } proto tcp from any to 10.233.185.20 port { smtp 80 443 993 995 1956 } keep state
pass in quick on { $ext_if } proto tcp from any to 10.233.185.15 port { 4505 4506 } keep state

# ---- Allow ICMP
pass in inet proto icmp all icmp-type $icmp_types keep state
pass out inet proto icmp all icmp-type $icmp_types keep state

pass in quick on $ext_if inet proto tcp from any to $ext_if port { 80, 443 } flags S/SA keep state
pass in quick on $ext_if inet proto tcp from <ssh-trusted> to $ext_if port { 4505 4506 } flags S/SA keep state
block log quick from <bruteforcers>
pass quick proto tcp from <ssh-trusted> to $ext_if port ssh flags S/SA keep state
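
For readers who want to see which public address each jail ends up with, here is a rough Python sketch of the translation rules above. It models only packets leaving through $ext_if, and it relies on pf using the first matching translation rule. The function name `translated_source` is hypothetical, and treating 87.233.191.146 as the address behind `($ext_if:0)` is an assumption based on it being the first inet address listed for vtnet0.

```python
import ipaddress

JAIL_NET = ipaddress.ip_network("10.233.185.0/24")
# Contents of the <jailnetworks> table.
JAIL_NETWORKS = [JAIL_NET, ipaddress.ip_network("192.168.10.0/24")]

# The three binat rules: internal jail address -> dedicated public address.
BINAT = {
    "10.233.185.15": "87.233.191.156",
    "10.233.185.20": "87.233.191.155",  # haproxy
    "10.233.185.22": "87.233.191.154",
}

# Assumption: ($ext_if:0) resolves to the first address on vtnet0.
EXT_PRIMARY = "87.233.191.146"


def translated_source(src: str, dst: str) -> str:
    """Source address after translation for a packet leaving on $ext_if."""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    # binat rules come first and only match destinations outside the jail net.
    if src in BINAT and d not in JAIL_NET:
        return BINAT[src]
    # Otherwise the generic nat rule applies to anything in <jailnetworks>.
    if any(s in net for net in JAIL_NETWORKS):
        return EXT_PRIMARY
    return src  # no translation rule matches


print(translated_source("10.233.185.20", "198.51.100.7"))  # haproxy jail
print(translated_source("10.233.185.18", "198.51.100.7"))  # web01 jail
```

Under these assumptions the haproxy jail talks to the outside world as 87.233.191.155, while web01 has no binat entry and falls through to the generic nat rule.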

This is as minimal as I can get it.

Hope this helps.
Regards,
Johan Hendriks

On Sat, 12 Mar 2022 at 02:10, Kristof Provost <kp@freebsd.org> wrote:
> On 11 Mar 2022, at 18:55, Michael Gmelin wrote:
> >> On 12. Mar 2022, at 01:21, Kristof Provost <kp@freebsd.org> wrote:
> >>
> >> On 11 Mar 2022, at 17:44, Johan Hendriks wrote:
> >>>> On 09/03/2022 20:55, Johan Hendriks wrote:
> >>>> The problem:
> >>>> I have a FreeBSD 14 machine and a FreeBSD 13-STABLE machine, both running the same jails just to test how they work.
> >>>>
> >>>> The jails that are running are a salt master, a haproxy jail, 2 webservers, 2 varnish servers, and 2 php jails, one with php8.0 and one with 8.1. All the jails are connected to bridge0 and all the jails use vnet.
> >>>>
> >>>> I believe this worked on an older 14-HEAD machine, but I did not do a lot with it back then. When I started testing again after updating the OS, I noticed that one of the varnish jails lost its network connection after running for a few hours. I thought it was just something on HEAD, so I never really looked at it. But later on, when I started using the jails again and testing a test WordPress site, I noticed that with a simple load test my haproxy jail loses its network connection within one minute. I see nothing in the logs, on the host or in the jail.
> >>>> From the jail I cannot ping the other jails or the IP address of the bridge. I can, however, ping the jail's own IP address. From the host I also cannot ping the haproxy jail's IP address. If I start a tcpdump on the epair "a" interface belonging to the haproxy jail, I do see the packets arrive, but they do not show up in the jail.
> >>>>
> >>>> I used ZFS to send all the jails to a 13-STABLE machine and copied over the jail.conf file as well as the pf.conf file, and I saw the same behavior.
> >>>>
> >>>> Then I tried 13.0-RELEASE-p7, and on that machine I do not see this happening. There I can stress test the machine for 10 minutes without a problem, but on 14-HEAD and 13-STABLE the jail's network connection fails within a minute. Only a restart of the jail brings it back online, and it exhibits the same behavior again as soon as I start a simple load test, which it should handle nicely.
> >>>>
> >>>> One of the jail hosts is running under VMware and the other is running under Ubuntu with KVM. The 13.0-RELEASE-p7 jail host is running under Ubuntu with KVM.
> >>>>
> >>>> Thank you for your time.
> >>>> regards
> >>>> Johan
> >>>>
> >>> I did some bisecting, and the last commit that works on FreeBSD 13-STABLE is 009a56b2e.
> >>> Commit 2e0bee4c7 (if_epair: implement fanout) and everything after it shows the symptoms described above.
> >>>
> >> Interestingly I cannot reproduce stalls in simple epair setups.
> >> It would be useful if you could reduce the setup with the problem into a minimal configuration so we can figure out what other factors are involved.
> >
> > If there are clear instructions on how to reproduce, I’m happy to help experimenting (I’m relying heavily on epair at this point).
> >
> > @Kristof: Did you try on bare metal or on vms?
> >
> Both.
>
> Kristof