Re: vnet jails loose network connectivity
- In reply to: Johan Hendriks : "vnet jails loose network connectivity"
Date: Mon, 07 Mar 2022 20:40:21 UTC
On 04/03/2022 15:36, Johan Hendriks wrote:
> Hello all, I use jails for some testing, but I cannot seem to make the
> setup stable.
> I use vnet jails with a bridge, but when I put some load on them, some
> jails lose their network connectivity.
>
> My setup is as follows: an haproxy jail with internal IP 10.233.185.20,
> made publicly accessible using binat.
> Then a varnish jail and two web servers, all in the 10.233.185.x range.
>
> If I give it a little load with hey (hey -h2 -n 10 -c 20 -z 60s
> https://wp.test.nl), then during the test the haproxy jail becomes
> unreachable: it is not pingable from the host machine or from the
> other jails. Restarting the jails solves it. When I left the system
> alone for some time, I also saw the varnish jail become unresponsive.
>
> If I run tcpdump on the epair${name}a interface, I do see the packets
> from the host machine to the jail, but the jail itself is not reachable.
>
> There is nothing in the logs of the host or the jail, and I can still
> ping the jail's IP address from inside the jail itself.
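
To narrow down where the packets get lost, one option is to capture on both ends of the epair at the same time. The following is only a sketch: the helper names epair_host_if and capture_both are made up for illustration, the interface names follow the jail.conf in this message (epair<ip>a on the host, vnet0 inside the jail), and it assumes the jail's devfs ruleset exposes bpf so that tcpdump can run inside the jail.

```sh
#!/bin/sh
# Hypothetical helpers, not part of the original setup.

# epair_host_if IP_SUFFIX: host-side epair name, built the same way
# jail.conf builds it (epair${ip} plus the "a" end).
epair_host_if() {
    echo "epair${1}a"
}

# capture_both JAIL IP_SUFFIX [COUNT]: capture ICMP on both ends of
# the epair in parallel and wait for both captures to finish.
capture_both() {
    jail=$1
    suffix=$2
    count=${3:-20}
    # Host end, which is the bridge member.
    tcpdump -n -c "$count" -i "$(epair_host_if "$suffix")" icmp &
    # Jail end; jail.conf renames epair<ip>b to vnet0.
    # Assumption: bpf is visible inside the jail.
    jexec "$jail" tcpdump -n -c "$count" -i vnet0 icmp &
    wait
}
```

Running "capture_both haproxy 20" as root while pinging 10.233.185.20 from the host would show whether echo requests seen on epair20a also arrive on vnet0. If they show up on the host end only, that would point at the epair pair rather than at the bridge or pf, though it does not prove it.
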
>
>
> I do not think I have a special setup, but I could be doing something
> wrong.
> My jail.conf:
>
> # Global settings applied to all jails.
> $domain = "test.nl";
> $subdomain = "";
>
> exec.start = "/bin/sh /etc/rc";
> exec.stop = "/bin/sh /etc/rc.shutdown";
> exec.clean;
>
> mount.fstab = "/storage/jails/$name.fstab";
>
> exec.system_user = "root";
> exec.jail_user = "root";
> mount.devfs;
> sysvshm="new";
> sysvsem="new";
> allow.raw_sockets;
> allow.set_hostname = 0;
> allow.sysvipc;
> enforce_statfs = "2";
> devfs_ruleset = "11";
>
> path = "/storage/jails/${name}";
> host.hostname = "${name}${subdomain}.${domain}";
>
> # Networking
> $uplinkdev = "vtnet1";
> $epid = "${ip}";
> $subnet = "10.233.185.";
> $cidr = "/24";
> $ipv4_addr = "${subnet}${ip}${cidr}";
> vnet;
> vnet.interface = "vnet0";
>
> $epair=epair${ip};
> vnet;
> #vnet.interface = "${epair}b"; # default vnet interface
> exec.prestart = "ifconfig bridge0 > /dev/null 2>&1 || ( ifconfig bridge0 create up && ifconfig bridge0 addm $uplinkdev )";
> exec.prestart += "ifconfig ${epair} create up description jail_${name} || echo 'Skipped creating epair (exists?)'";
> exec.prestart += "ifconfig bridge0 addm ${epair}a || echo 'Skipped adding bridge member (already member?)'";
> exec.created = "ifconfig ${epair}b name vnet0";
> exec.start = "/bin/sh /etc/rc";
> exec.consolelog = "/var/log/jail/$name.test.nl";
> exec.stop = "/bin/sh /etc/rc.shutdown";
> exec.poststop = "ifconfig bridge0 deletem ${epair}a";
> exec.poststop += "ifconfig ${epair}a destroy";
>
> varnish01 {
> $ip = 16;
> mount.fstab = "";
> path = "/storage/jails/${name}";
> }
>
> web01 {
> $ip = 18;
> }
>
> web02 {
> $ip = 19;
> }
>
> haproxy {
> $ip = 20;
> mount.fstab = "";
> path = "/storage/jails/${name}";
> }
>
> My ifconfig output:
>
> bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         ether 58:9c:fc:10:ff:82
>         inet 10.233.185.1 netmask 0xffffff00 broadcast 10.233.185.255
>         id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
>         maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
>         root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
>         member: epair20a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>                 ifmaxaddr 0 port 13 priority 128 path cost 2000
>         member: epair19a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>                 ifmaxaddr 0 port 53 priority 128 path cost 2000
>         member: epair18a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>                 ifmaxaddr 0 port 48 priority 128 path cost 2000
>         member: epair16a flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
>                 ifmaxaddr 0 port 28 priority 128 path cost 2000
>         groups: bridge
>         nd6 options=9<PERFORMNUD,IFDISABLED>
> epair16a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         description: jail_varnish01
>         options=8<VLAN_MTU>
>         ether 02:76:32:8e:0e:0a
>         groups: epair
>         media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
>         status: active
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> epair18a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         description: jail_web01
>         options=8<VLAN_MTU>
>         ether 02:6d:be:b8:36:0a
>         groups: epair
>         media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
>         status: active
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> epair19a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         description: jail_web02
>         options=8<VLAN_MTU>
>         ether 02:54:fd:77:9a:0a
>         groups: epair
>         media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
>         status: active
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> epair20a: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         description: jail_haproxy
>         options=8<VLAN_MTU>
>         ether 02:f8:58:06:78:0a
>         groups: epair
>         media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
>         status: active
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>
> This is on both 13-STABLE and 14-HEAD.
>
>
For the sake of testing I tried it with FreeBSD 13.0-RELEASE-p7, and there
it works fine. This is an exact copy of the setup I use on 14-CURRENT and
13-STABLE (I did a ZFS send and receive of the jails and copied
jail.conf, pf.conf and so on). I ran the hey command against the
13.0-RELEASE system multiple times:
hey -h2 -n 10 -c 30 -z 300s https://wp.test.nl
Summary:
Total: 300.0045 secs
Slowest: 0.1137 secs
Fastest: 0.0006 secs
Average: 0.0090 secs
Requests/sec: 4627.4504
Response time histogram:
0.001 [1] |
0.012 [977291] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.023 [21236] |■
0.035 [1125] |
0.046 [230] |
0.057 [12] |
0.068 [18] |
0.080 [9] |
0.091 [18] |
0.102 [30] |
0.114 [30] |
Latency distribution:
10% in 0.0037 secs
25% in 0.0046 secs
50% in 0.0061 secs
75% in 0.0080 secs
90% in 0.0096 secs
95% in 0.0106 secs
99% in 0.0133 secs
Details (average, fastest, slowest):
DNS+dialup: 0.0000 secs, 0.0006 secs, 0.1137 secs
DNS-lookup: 0.0000 secs, 0.0000 secs, 0.0028 secs
req write: 0.0001 secs, 0.0000 secs, 0.1126 secs
resp wait: 0.0192 secs, 0.0000 secs, 214.9645 secs
resp read: 0.0018 secs, 0.0002 secs, 0.1076 secs
Status code distribution:
[200] 1000000 responses
All is fine on 13.0-RELEASE-p7, even with a higher concurrency.
However, if I run it against 14-CURRENT or 13-STABLE, even a run
of 60 seconds kills the network connectivity of the jail (haproxy in my
case).
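
To pin down the moment a jail drops off the network during such a run, a small watch loop on the host could help. This is a minimal sketch, not something from the setup above: the function names probe, check_once and watch_jails are hypothetical, the jail names and addresses are copied from the jail.conf quoted earlier, and the ping flags assume FreeBSD's ping, where -t is a timeout in seconds.

```sh
#!/bin/sh
# Hypothetical helper script; names and addresses from the quoted jail.conf.
JAILS="varnish01:10.233.185.16 web01:10.233.185.18 web02:10.233.185.19 haproxy:10.233.185.20"

# probe ADDR: succeed if a single ICMP echo is answered within 1 second.
# Assumes FreeBSD ping(8), where -t sets an overall timeout.
probe() {
    ping -c 1 -t 1 "$1" > /dev/null 2>&1
}

# check_once: print "name addr up|down" for every entry in JAILS.
check_once() {
    for entry in $JAILS; do
        name=${entry%%:*}
        addr=${entry#*:}
        if probe "$addr"; then state=up; else state=down; fi
        echo "$name $addr $state"
    done
}

# watch_jails [INTERVAL]: poll forever and print a UTC timestamp plus
# the full state table whenever any jail changes state.
watch_jails() {
    interval=${1:-1}
    prev=""
    while :; do
        cur=$(check_once)
        if [ "$cur" != "$prev" ]; then
            date -u '+%Y-%m-%dT%H:%M:%SZ'
            echo "$cur"
            prev=$cur
        fi
        sleep "$interval"
    done
}
```

Running "watch_jails 1" in one terminal on the host while hey runs in another would give a timestamp for each state change, which can then be lined up with the hey run and with any tcpdump captures.
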
regards,
Johan