[BETA7-panic] sodealloc(): so_count 1
Vlad
marchenko at gmail.com
Thu Oct 7 12:11:53 PDT 2004
Brian,
thanks for your work. Question: is this patch is something I can feel
comfortable about trying it on a production server? Did you test it?
On Thu, 7 Oct 2004 14:18:52 -0400, Brian Fundakowski Feldman
<green at freebsd.org> wrote:
>
>
> On Thu, Oct 07, 2004 at 01:00:08AM +0200, Marc UBM Bocklet wrote:
> > On Wed, 6 Oct 2004 17:51:34 -0400
> > Brian Fundakowski Feldman <green at freebsd.org> wrote:
> >
> > > On Wed, Oct 06, 2004 at 03:55:21PM -0400, Vlad wrote:
> > > > Brian,
> > > >
> > > > I've created ticket a while ago in regards to this problem:
> > > >
> > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/72126
> > > >
> > > > also, attached are some additional debug info I could get during
> > > > last two times it crashed.
> > > >
> > > > unfortunately I don't have big enough stand-alone swap partition to
> > > > get kernel crash dump, so those files are the best I could get.
> > > >
> > > > here is week-old thread for the same problem:
> > > > http://groups.google.com/groups?th=fc0d7d881f0713cc
> > >
> > > Can you tell me where sorele() crashed when you did _not_ have
> > > INVARIANTS enabled? That will help because the INVARIANTS panic tells
> > > us one half of where the problem is and the sorele() panic is the
> > > other half. Want to try this (untested) patch for the problem that I'm
> > > guessing about?
> >
> > [possible patch for sodealloc panic]
> >
> > Ok, I'm just now recompiling my kernel with your patch, I should be able
> > to boot the kernel tomorrow, then we'll see if there are any further
> > panics.
> >
> > Thanks a lot for your help! :-)
>
> You're welcome! If that does work (or mostly works), please try these
> deltas instead. I have only tried to make TCP work correctly in this
> respect, but all protocols need to be able to atomically with regard
> to both the accept lock and the socket lock give up their pcb from
> the proto bottom half and give final "ownership" of the socket to its
> parent (listening socket) before never touching it again.
>
> The order of SOCK_LOCK(), so_pcb = NULL, check for 0 references, check
> for no fd reference, SOCK_UNLOCK(), ACCEPT_LOCK() violates this. This
> is mostly something that could easily be fixed by using regular GC
> methods to do the final bit of deallocation.
>
> Index: sys/socketvar.h
> ===================================================================
> RCS file: /usr/ncvs/src/sys/sys/socketvar.h,v
> retrieving revision 1.133
> diff -u -r1.133 socketvar.h
> --- sys/socketvar.h 12 Jul 2004 21:42:33 -0000 1.133
> +++ sys/socketvar.h 7 Oct 2004 17:52:49 -0000
> @@ -205,6 +205,7 @@
> #define SS_ASYNC 0x0200 /* async i/o notify */
> #define SS_ISCONFIRMING 0x0400 /* deciding to accept connection req */
> #define SS_ISDISCONNECTED 0x2000 /* socket disconnected from peer */
> +#define SS_HASBEENCOMP 0x4000 /* proto made socket SQ_COMP */
>
> /*
> * Socket state bits now stored in the socket buffer state field.
> @@ -360,6 +361,16 @@
> SOCK_UNLOCK(so); \
> } while(0)
>
> +#define sopcbfreed(so) do { \
> + SOCK_LOCK_ASSERT(so); \
> + (so)->so_pcb = NULL; \
> + if ((so)->so_count == 0 && \
> + ((so)->so_state & SS_HASBEENCOMP) == 0) \
> + sofree(so); \
> + else \
> + SOCK_UNLOCK(so); \
> +} while(0)
> +
> /*
> * In sorwakeup() and sowwakeup(), acquire the socket buffer lock to
> * avoid a non-atomic test-and-wakeup. However, sowakeup is
> Index: kern/uipc_socket2.c
> ===================================================================
> RCS file: /usr/ncvs/src/sys/kern/uipc_socket2.c,v
> retrieving revision 1.137
> diff -u -r1.137 uipc_socket2.c
> --- kern/uipc_socket2.c 15 Aug 2004 06:24:41 -0000 1.137
> +++ kern/uipc_socket2.c 7 Oct 2004 17:42:15 -0000
> @@ -117,12 +117,15 @@
> {
> struct socket *head;
>
> + ACCEPT_LOCK();
> SOCK_LOCK(so);
> so->so_state &= ~(SS_ISCONNECTING|SS_ISDISCONNECTING|SS_ISCONFIRMING);
> so->so_state |= SS_ISCONNECTED;
> - SOCK_UNLOCK(so);
> - ACCEPT_LOCK();
> head = so->so_head;
> + if (head && (so->so_qstate & SQ_INCOMP) &&
> + (so->so_options & SO_ACCEPTFILTER) == 0)
> + so->so_state |= SS_HASBEENCOMP;
> + SOCK_UNLOCK(so);
> if (head != NULL && (so->so_qstate & SQ_INCOMP)) {
> if ((so->so_options & SO_ACCEPTFILTER) == 0) {
> TAILQ_REMOVE(&head->so_incomp, so, so_list);
> Index: kern/uipc_syscalls.c
> ===================================================================
> RCS file: /usr/ncvs/src/sys/kern/uipc_syscalls.c,v
> retrieving revision 1.200
> diff -u -r1.200 uipc_syscalls.c
> --- kern/uipc_syscalls.c 15 Aug 2004 06:24:41 -0000 1.200
> +++ kern/uipc_syscalls.c 7 Oct 2004 17:55:22 -0000
> @@ -314,9 +314,8 @@
> KASSERT(so->so_qstate & SQ_COMP, ("accept1: so not SQ_COMP"));
>
> /*
> - * Before changing the flags on the socket, we have to bump the
> - * reference count. Otherwise, if the protocol calls sofree(),
> - * the socket will be released due to a zero refcount.
> + * XXX This reference should really be done by soaccept(). Once
> + * SQ_COMP, a socket will never disappear from underneath us.
> */
> SOCK_LOCK(so);
> soref(so); /* file descriptor reference */
> cvs diff: Diffing netinet
> Index: netinet/in_pcb.c
> ===================================================================
> RCS file: /usr/ncvs/src/sys/netinet/in_pcb.c,v
> retrieving revision 1.155
> diff -u -r1.155 in_pcb.c
> --- netinet/in_pcb.c 29 Sep 2004 04:01:13 -0000 1.155
> +++ netinet/in_pcb.c 7 Oct 2004 17:58:38 -0000
> @@ -688,8 +688,7 @@
> in_pcbremlists(inp);
> if (so) {
> SOCK_LOCK(so);
> - so->so_pcb = NULL;
> - sotryfree(so);
> + sopcbfreed(so);
> }
> if (inp->inp_options)
> (void)m_free(inp->inp_options);
> Index: netinet/tcp_subr.c
> ===================================================================
> RCS file: /usr/ncvs/src/sys/netinet/tcp_subr.c,v
> retrieving revision 1.203
> diff -u -r1.203 tcp_subr.c
> --- netinet/tcp_subr.c 5 Sep 2004 02:34:12 -0000 1.203
> +++ netinet/tcp_subr.c 7 Oct 2004 17:59:01 -0000
> @@ -1686,10 +1686,9 @@
> tcp_discardcb(tp);
> so = inp->inp_socket;
> SOCK_LOCK(so);
> - so->so_pcb = NULL;
> tw->tw_cred = crhold(so->so_cred);
> tw->tw_so_options = so->so_options;
> - sotryfree(so);
> + sopcbfreed(so);
> inp->inp_socket = NULL;
> if (acknow)
> tcp_twrespond(tw, TH_ACK);
>
>
>
> --
> Brian Fundakowski Feldman \'[ FreeBSD ]''''''''''\
> <> green at FreeBSD.org \ The Power to Serve! \
> Opinions expressed are my own. \,,,,,,,,,,,,,,,,,,,,,,\
> _______________________________________________
> freebsd-current at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe at freebsd.org"
>
--
Vlad
More information about the freebsd-current
mailing list