[Bug 263062] tcp_inpcb leaking in VM environment

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 05 Apr 2022 14:17:25 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263062

            Bug ID: 263062
           Summary: tcp_inpcb leaking in VM environment
           Product: Base System
           Version: 13.1-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: eugene@zhegan.in

I'm running 13.0-RELEASE or 13.1-RC1 in a virtual machine connected to the outside
world via the vtnet(4) driver. The VM is presumably run as a Q35-chipset VM, definitely
under KVM/QEMU in a Hetzner cloud datacenter.

The VM is used as a web server, proxying a wss/grpc application via nginx with
relatively long-lived connections. It has 16 GB of memory and runs the GENERIC
kernel. Nginx services around 30-40K established connections.

After 2-3 hours of uptime the VM starts to show several signs of kernel
structure leakage.

I can see multiple errors in dmesg:

sonewconn: pcb 0xfffff8001ac8bd90: pru_attach() failed
sonewconn: pcb 0xfffff8000ab625d0: pru_attach() failed
sonewconn: pcb 0xfffff8000af999b0: pru_attach() failed
sonewconn: pcb 0xfffff8000ab621f0: pru_attach() failed
sonewconn: pcb 0xfffff8000ab62000: pru_attach() failed
sonewconn: pcb 0xfffff8000ab625d0: pru_attach() failed
sonewconn: pcb 0xfffff8000af999b0: pru_attach() failed
sonewconn: pcb 0xfffff8000af999b0: pru_attach() failed
sonewconn: pcb 0xfffff8000af993e0: pru_attach() failed
sonewconn: pcb 0xfffff8000af999b0: pru_attach() failed
sonewconn: pcb 0xfffff8000ab627c0: pru_attach() failed


The console is being spammed with errors:

[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached


and the network stack is basically unusable:


# telnet 127.0.0.1 4080
Trying 127.0.0.1...
telnet: socket: No buffer space available 

This is definitely caused by leaking tcp_inpcb entries. Their count keeps growing
over time and never comes back down:

(these samples were taken at 10-second intervals)
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP XDOMAIN
tcp_inpcb:              496, 4189440,  802462,    1194, 1050344,   0,   0,   0
tcp_inpcb:              496, 4189440,  803971,    1469, 1051853,   0,   0,   0
tcp_inpcb:              496, 4189440,  805375,    1081, 1053257,   0,   0,   0
tcp_inpcb:              496, 4189440,  806936,    1296, 1054818,   0,   0,   0
tcp_inpcb:              496, 4189440,  808609,    1143, 1056491,   0,   0,   0
tcp_inpcb:              496, 4189440,  810052,    1228, 1057934,   0,   0,   0
tcp_inpcb:              496, 4189440,  811487,     809, 1059369,   0,   0,   0
tcp_inpcb:              496, 4189440,  813068,    1260, 1060950,   0,   0,   0
tcp_inpcb:              496, 4189440,  814532,    1068, 1062414,   0,   0,   0
tcp_inpcb:              496, 4189440,  816036,    1084, 1063918,   0,   0,   0
tcp_inpcb:              496, 4189440,  817511,    1641, 1065393,   0,   0,   0
tcp_inpcb:              496, 4189440,  818988,     924, 1066870,   0,   0,   0
tcp_inpcb:              496, 4189440,  820412,    1532, 1068294,   0,   0,   0
tcp_inpcb:              496, 4189440,  821880,     832, 1069762,   0,   0,   0
tcp_inpcb:              496, 4189440,  823399,    1345, 1071281,   0,   0,   0
tcp_inpcb:              496, 4189440,  824865,     895, 1072747,   0,   0,   0
tcp_inpcb:              496, 4189440,  826309,    1227, 1074191,   0,   0,   0
tcp_inpcb:              496, 4189440,  827594,     958, 1075476,   0,   0,   0
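
Samples of this form can be collected with a loop along these lines (only a
sketch, not necessarily the exact command used):

# while true; do vmstat -z | grep '^tcp_inpcb'; sleep 10; done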

At the same time, kern.ipc.numopensockets is relatively low:

kern.ipc.numopensockets: 34689

I also have several 12.x and 13.x machines running the same stack on bare metal,
but with more RAM: 96-128 GB. This never happens on those. One could argue that
the amount of RAM is the reason, and that may seem plausible. However:

- the bare-metal servers handle far more connections; for instance, one
bare-metal 13.0 server serves around 300K connections:

TCP connection count by state:
4 connections in CLOSED state
65 connections in LISTEN state
31 connections in SYN_SENT state
446 connections in SYN_RCVD state
292378 connections in ESTABLISHED state
5 connections in CLOSE_WAIT state
27467 connections in FIN_WAIT_1 state
266 connections in CLOSING state
6714 connections in LAST_ACK state
5114 connections in FIN_WAIT_2 state
40976 connections in TIME_WAIT state

the number of open sockets is also far higher:

kern.ipc.numopensockets: 332907

But the tcp_inpcb usage is far lower:

ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
tcp_inpcb:              488, 4189440,  374628,  181772,27330286910,   0,   0
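
For reference, the per-state connection counts and the zone line above come from
the standard tools; something along these lines (a sketch, not necessarily the
exact invocations):

# netstat -s -p tcp | grep 'connections in'
# vmstat -z | grep -E '^ITEM|^tcp_inpcb'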

So I assume this leak is specific to FreeBSD running in a virtual environment,
and is probably related to the virtio drivers.

As a workaround I have tried tweaking some sysctl OIDs:

kern.maxfiles=4189440
kern.ipc.maxsockets=4189440
net.inet.tcp.tcbhashsize=1048576
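
A /boot/loader.conf fragment along these lines would apply them at boot (a
sketch; net.inet.tcp.tcbhashsize in particular can only be set as a boot-time
tunable):

kern.maxfiles="4189440"
kern.ipc.maxsockets="4189440"
net.inet.tcp.tcbhashsize="1048576"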

but this measure only delayed the tcp_inpcb exhaustion.

-- 
You are receiving this mail because:
You are the assignee for the bug.