panics in soabort with so_count != 0, one possible solution to one cause.

Andre Oppermann andre at freebsd.org
Wed Jan 9 19:42:04 UTC 2013


On 09.01.2013 18:12, Gleb Smirnoff wrote:
> On Wed, Jan 09, 2013 at 04:22:15PM +0100, Steve Read wrote:
> S> Context for this message:
> S> http://www.freebsd.org/cgi/query-pr.cgi?pr=145825&cat=kern
> S> kern/145825: [panic] panic: soabort: so_count
> S>
> S> AND
> S>
> S> http://www.freebsd.org/cgi/query-pr.cgi?pr=159621
> S> kern/159621: [tcp] [panic] panic: soabort: so_count
> S>
> S> The two PRs are essentially reporting the same thing, and I have seen
> S> evidence of people reporting this panic against kernels as old as 6.2.
> S>
> S> == Scenario ==
> S> The basic scenario is:
> S> 1. There is a local listening TCP socket.  A userland thread is waiting
> S> on a kqueue, and will eventually call accept() on this socket.
> S> 2. A new TCP connection arrives that matches this TCP socket.  Syncache
> S> hangs on to the connection until the three-way handshake is complete
> S> (i.e. the ACK arrives).
> S> 3. At this point, syncache_socket() calls sonewconn() and passes
> S> SS_ISCONNECTED.  sonewconn() as a result hands the new socket off to the
> S> accept queue and wakes up the userland thread (marks the listening
> S> socket "readable", sends a kqueue notification, etc.).
> S> 4. Something goes wrong during the rest of syncache_socket(), as a
> S> result of which it calls soabort().
> S>
> S> == Consequence ==
> S> On a single-CPU machine, the netisr thread that called syncache_socket()
> S> blocks out the userland thread until it has finished, so so_count of the
> S> new connected socket is still zero when syncache_socket() calls
> S> soabort().  (It's not absolutely guaranteed, as there are calls to
> S> locking functions along the way, but it usually happens.)
> S>
> S> On a multi-CPU machine of any sort, the userland thread resumes
> S> as soon as it is woken up, and it is possible (but not guaranteed)
> S> for it to grab the socket and increment its so_count before
> S> syncache_socket() calls soabort().
> S>
> S> I have a core which shows the netisr thread hitting the panic in
> S> soabort(), while the expected userland thread (on a different CPU) is
> S> still in the kernel, churning through the post-pickup part of accept().
> S>
> S> == Proposed solution ==
> S> My proposed solution to this issue is:
> S> 1. Replace SS_ISCONNECTED with 0 in the call to sonewconn() to prevent
> S> it from waking up the listening thread.
> S> 2. At the "end" of syncache_socket(), call soisconnected(), passing the
> S> new socket.  This will issue the wakeup after syncache_socket() has
> S> finished preparing itself, and in particular after the last possible
> S> call to soabort().
> S>
> S> I'm concerned, of course, that this may cause some unobvious fallout
> S> somewhere, but I can't see for the moment what it would be.  Any advice
> S> would be welcome.
> S>
> S> == Patch that applies the proposed solution ==
> S> A patch that would apply to kernel 8.3 (the basic scenario appears to
> S> still be feasible with HEAD, and the code is very similar):
> S>
> S> ======
> S> --- netinet/tcp_syncache.c.orig    2013-01-09 13:18:05.000000000 +0000
> S> +++ netinet/tcp_syncache.c    2013-01-09 14:03:54.000000000 +0000
> S> @@ -638,7 +638,7 @@
> S>        * connection when the SYN arrived.  If we can't create
> S>        * the connection, abort it.
> S>        */
> S> -    so = sonewconn(lso, SS_ISCONNECTED);
> S> +    so = sonewconn(lso, 0);
> S>       if (so == NULL) {
> S>           /*
> S>            * Drop the connection; we will either send a RST or
> S> @@ -831,6 +831,8 @@
> S>
> S>       INP_WUNLOCK(inp);
> S>
> S> +    soisconnected(so);
> S> +
> S>       TCPSTAT_INC(tcps_accepts);
> S>       return (so);
>
> AFAIU, in head this race was fixed by r243627. Can Vijay and Andre comment?

No, this is a different problem.  Here it is a race with the newly
created socket that is not fully set up yet.  The socket structure
itself is set up, but the INPCB/TCPCB is not yet finished.  The new
socket is not locked but is already placed on the accept queue as if
fully set up.  From there it may get picked up by userspace already.
There are a number of things that can go wrong in syncache_socket(),
like insertion into the INPCBHASH.  External influence by way of a
RST or such is not part of this problem, I think.  When
syncache_socket() decides to abort the socket it calls soabort(),
which is an internal quick cleanup that assumes that no consumers
of the socket exist yet.  In this case that isn't true, as another
core has already snatched up the socket and delivered it to userspace.
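
To make the failing assumption concrete, here is a rough sketch of the
interleaving on SMP (written as comments, not literal kernel code; the
exact failure inside syncache_socket() can be the INPCBHASH insertion
or something else):

/*
 * netisr thread (CPU 0)               userland thread (CPU 1)
 * ---------------------               -----------------------
 * syncache_socket()
 *   sonewconn(lso, SS_ISCONNECTED)
 *     -> new socket placed on the
 *        complete accept queue,
 *        listener woken up
 *                                     accept() takes the socket off
 *                                     the queue and references it,
 *                                     so its so_count is now non-zero
 *   something fails (e.g. the
 *   INPCBHASH insertion)
 *   soabort(so)
 *     -> asserts so_count == 0
 *        -> "panic: soabort: so_count"
 */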

The locking model on accept sockets unfortunately is not fully refined
yet.

There are three possible ways to fix this as I can see at the moment:
  1. the proposed patch going through the incomplete accept queue first;
  2. lock the newly created socket;
  3. defer adding the socket to the complete accept queue for a few
     cycles, without going through the incomplete queue.

The first one is the simplest and carries the lowest risk at the expense
of a bit of unnecessary work.  The second one requires an extension of
the current locking model (or lack thereof in this area) and needs more
work to get everything right, also for other protocols.  sonewconn() would
then return a locked socket.  Current callers do not expect that.  The
third one would add a special case to sonewconn(), at the caller's request
forgoing the incomplete queue.
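
For reference, the flow in syncache_socket() with the first option
applied would roughly look like this (a sketch of the resulting
ordering, not the literal diff context):

	so = sonewconn(lso, 0);		/* was SS_ISCONNECTED */
	/*
	 * With a zero connection status the socket only goes onto the
	 * listen socket's incomplete queue; the listener is not woken
	 * up and accept() cannot see the socket yet.
	 */
	if (so == NULL) {
		/* drop the connection, as before */
	}

	/* ... finish setting up inp/tp, INPCBHASH insertion, etc. ... */
	/* a soabort() anywhere in here is now safe: so_count is 0 */

	INP_WUNLOCK(inp);

	soisconnected(so);
	/*
	 * Only now is the socket moved from the incomplete to the
	 * complete accept queue and the listener woken up, i.e. after
	 * the last possible soabort() call.
	 */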

For now I'd say the first option, the proposed patch, is the way to go
as it has the lowest risk with only a little overhead.  And it is easy to
MFC.

-- 
Andre


