Passenger hangs on live and SEGV on tests possible threading /
kernel bug?
John Baldwin
jhb at freebsd.org
Mon Dec 21 13:35:59 UTC 2009
On Thursday 17 December 2009 12:27:17 pm Steven Hartland wrote:
> ----- Original Message -----
> From: "John Baldwin" <jhb at freebsd.org>
> > For the hang it seems you have a thread waiting in a blocking read(), a thread
> > waiting in a blocking accept(), and lots of threads creating condition
> > variables. However, the pthread_cond_init() in libpthread (libthr on FreeBSD)
> > doesn't call pthread_cleanup_push(), so your stack trace doesn't make sense to
> > me. However, that may be gdb getting confused. The pthread_cleanup_push()
> > frame may be cond_init(). However, it doesn't call umtx_op() (the
> > _thr_umutex_init() call it makes just initializes the structure, it doesn't
> > make a _umtx_op() system call). You might try posting on threads@ to try to
> > get more info on this, but your pthread_cond_init() stack traces don't really
> > make sense. Can you rebuild libc and libthr with debug symbols?
> >
> > For example:
> >
> > # cd /usr/src/lib/libc
> > # make clean
> > # make DEBUG_FLAGS=-g
> > # make DEBUG_FLAGS=-g install
> >
> > However, if you are hanging in read(), that usually means you have a socket
> > that just doesn't have data. That might be an application bug of some sort.
> >
> > The segv trace doesn't include the first part of GDB messages which show which
> > thread actually had a seg fault. It looks like it was the thread that was
> > throwing an exception. However, nanosleep() doesn't throw exceptions, so that
> > stack trace doesn't really make sense either. Perhaps that stack is hosed by
> > the exception handling code?
>
> I've uploaded two more traces for the oxt test failure / segv.
> http://code.google.com/p/phusion-passenger/issues/detail?id=441#c1
>
> From looking at the test case, it is testing the capture of failures and its
> ability to create stack trace output, so that may give others some indication
> of where the issue may be.
>
> I will look to do the same for the hang issue, but that's on a live site, so
> will need to schedule some downtime before I can get those rebuilt and then
> wait for it to hang again, which could be quite some time :(
Hmmm, the only seg fault I see is happening down inside libgcc, in the stack
unwinding code, and that is 3rd party code from gcc.
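
For the libthr half of the rebuild I suggested earlier, the steps should be
the same as the libc example, just run from the libthr directory. A minimal
sketch, assuming the stock /usr/src layout; the debug_rebuild_cmds helper name
is made up for illustration, and it only prints the commands so they can be
reviewed before being run as root:

```shell
#!/bin/sh
# Hypothetical helper: prints (does NOT run) the debug-symbol rebuild
# commands for each library name given as an argument.
# Assumes the sources live under /usr/src/lib/<name>.
debug_rebuild_cmds() {
    for lib in "$@"; do
        printf 'cd /usr/src/lib/%s && make clean && make DEBUG_FLAGS=-g && make DEBUG_FLAGS=-g install\n' "$lib"
    done
}

# Show the commands for both libraries mentioned in this thread.
debug_rebuild_cmds libc libthr
```

Once both are rebuilt and installed, the pthread_cond_init() frames in a fresh
backtrace should hopefully resolve to real symbols.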
--
John Baldwin