[Bug 289734] panic tcp_usr_close while running mount command after configure NFS over TLS

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 28 Oct 2025 20:42:57 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=289734

--- Comment #18 from Rick Macklem <rmacklem@FreeBSD.org> ---
(In reply to Gleb Smirnoff from comment #17)
I haven't found time to look at the code, but
here is what the old (FreeBSD-14) code does:

(A) - When the krpc receives a "needs a TLS handshake"
      request (a Null RPC with "STARTTLS" stuffed in it),
      the krpc does an upcall to the userland daemon (rpc.tlsservd).

(B) - The userland daemon (rpc.tlsservd) does a syscall that says
      "I need a file descriptor for the socket".
      The krpc cobbles together a file descriptor for the socket on
      the daemon's behalf.
***   At this point the krpc marks the socket as "to be closed by the
      daemon, not by soclose() here in the kernel" and returns the
      file descriptor to the daemon.
(C) - The daemon sets the SSL library to use the socket file
      descriptor, notes that it is responsible for doing a close(s)
      on the socket and calls SSL_accept() to do the actual handshake.
(D) - After SSL_accept() returns, the daemon replies to the upcall
      done at (A) with the results of the TLS handshake.

Note that (B) at "***" is the exact point at which responsibility
for closing the socket is given to the daemon (rpc.tlsservd).
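
As a rough illustration of the handoff at "***", here is a small
userland model in C. It is only a sketch: the struct, the enum and
all model_* names are made up for this example and do not exist in
the krpc or rpc.tlsservd sources, and the descriptor number is a
placeholder. The point it shows is that the mark and the descriptor
handout happen in one step, so exactly one side ever closes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in the FreeBSD sources. */
enum model_closer { CLOSER_KRPC, CLOSER_DAEMON };

struct model_sock {
	enum model_closer closer;	/* who is responsible for the close */
	int close_count;		/* how many times the socket got closed */
};

/*
 * Step (B): hand a descriptor to the daemon and, in the same step,
 * transfer responsibility for the close (the "***" point).
 */
static int
model_get_fd(struct model_sock *so)
{
	assert(so != NULL && so->closer == CLOSER_KRPC);
	so->closer = CLOSER_DAEMON;
	return (3);	/* placeholder descriptor number */
}

/* Kernel teardown path: stands in for soclose(), skipped once marked. */
static void
model_krpc_teardown(struct model_sock *so)
{
	if (so->closer == CLOSER_KRPC)
		so->close_count++;
}

/* Daemon path: stands in for close(s) after the handshake. */
static void
model_daemon_close(struct model_sock *so)
{
	if (so->closer == CLOSER_DAEMON)
		so->close_count++;
}
```

Running model_get_fd() followed by both teardown paths leaves
close_count at 1, since the krpc path sees the mark and skips its
close.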

My understanding is that the glebius@ patch got rid of (B)
and my hunch is there is now a time window between (A) and (D)
where both the daemon (rpc.tlsservd) and the krpc might do a
[so]close() on the socket.
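
For reference, one generic way to make such a window harmless is to
have both sides claim the close atomically, so that only the first
caller proceeds. The sketch below uses C11 atomics in userland. It
is not what the FreeBSD-14 code or the glebius@ patch does, and it
is not the fix proposed here; the guarded_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a socket whose close can only be claimed once. */
struct guarded_sock {
	atomic_flag close_claimed;	/* set by whichever side closes first */
	int close_count;		/* how many closes really happened */
};

/*
 * Called by either side (daemon or krpc in the scenario above).
 * Returns true if this caller performed the close, false if the
 * other side had already claimed it.
 */
static bool
guarded_close(struct guarded_sock *so)
{
	assert(so != NULL);
	/* test_and_set returns the previous value: true means "already claimed". */
	if (atomic_flag_test_and_set(&so->close_claimed))
		return (false);
	so->close_count++;	/* only the single winner reaches this line */
	return (true);
}
```

With this in place a second guarded_close() on the same socket is a
no-op, whichever side gets there first.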

The easy way for me to fix this (since I am not familiar with
glebius@'s code) is to go back to the FreeBSD-14 code and make
the minimal changes needed for it to use netlink for the upcall
instead of an AF_LOCAL socket (which, as I understand it, was
the original goal of the glebius@ patch).
--> In other words, return it to using the syscall at (B) and
    using separate daemon processes (with a TCP connection pinned
    to one of them) instead of pthreads.

If glebius@ is ok with doing this, I can do so fairly quickly
and come up with a patch for testing.
