suspect problems on -current with pthread_cond_*()
Poul-Henning Kamp
phk at phk.freebsd.dk
Wed Oct 6 18:13:58 UTC 2010
Hi Guys,
I updated my machine to -current (9.0-CURRENT #0 r213377M: Mon Oct
4; the previous version was from sometime in April) and have started
to see weird new problems with the Varnish regression tests.
It's pretty hard to get a trace on the problem, but from what I
have found out so far, it is related to the very first operation(s)
on a pthread_cond_t, and the typical indication is a 100% CPU spin
inside libthr.
I can reproduce the problem in approximately 5 minutes by repeatedly
running the automated Varnish regression tests in >=8 parallel streams[1],
but due to the nature/complexity of Varnish, I have not been able to
get a debugger to give me a useful backtrace yet.
I only use pthread_cond_t's in two isolated places and I am going to
muck about with them now, to see if I can affect the issue in any way
(higher/lower failure rate etc).
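
As a hedged illustration only, a standalone stress test of the "first
operation on a fresh pthread_cond_t" pattern could look like the sketch
below. All names in it (struct handoff, first_use_round,
stress_first_use, MAX_STREAMS) are hypothetical and not taken from
Varnish, and it is not known whether this reproduces the spin; it
merely exercises the first wait and first signal on newly initialized
condition variables from several parallel streams.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define MAX_STREAMS 64  /* arbitrary cap, an assumption of this sketch */

/* Hypothetical handoff object: one mutex, one cond, one predicate. */
struct handoff {
    pthread_mutex_t mtx;
    pthread_cond_t  cond;
    int             ready;
};

/* Waiter thread: blocks on the cond until the predicate is set. */
static void *waiter(void *arg)
{
    struct handoff *h = arg;

    pthread_mutex_lock(&h->mtx);
    while (!h->ready)
        pthread_cond_wait(&h->cond, &h->mtx);
    pthread_mutex_unlock(&h->mtx);
    return NULL;
}

/* One round: init a fresh cond, do its first wait/signal, destroy it.
 * Returns 0 on success, -1 if setup failed. */
static int first_use_round(void)
{
    struct handoff h;
    pthread_t thr;

    h.ready = 0;
    if (pthread_mutex_init(&h.mtx, NULL) != 0)
        return -1;
    if (pthread_cond_init(&h.cond, NULL) != 0) {
        pthread_mutex_destroy(&h.mtx);
        return -1;
    }
    if (pthread_create(&thr, NULL, waiter, &h) != 0) {
        pthread_cond_destroy(&h.cond);
        pthread_mutex_destroy(&h.mtx);
        return -1;
    }

    pthread_mutex_lock(&h.mtx);
    h.ready = 1;
    pthread_cond_signal(&h.cond);
    pthread_mutex_unlock(&h.mtx);

    pthread_join(thr, NULL);
    pthread_cond_destroy(&h.cond);
    pthread_mutex_destroy(&h.mtx);
    return 0;
}

/* One stream: runs the requested number of rounds back to back and
 * returns how many completed successfully. */
static void *stream(void *arg)
{
    long rounds = (long)(intptr_t)arg;
    long done = 0;

    for (long i = 0; i < rounds; i++)
        if (first_use_round() == 0)
            done++;
    return (void *)(intptr_t)done;
}

/* Run `streams` parallel streams of `rounds` rounds each.
 * Returns the total number of rounds that completed. */
long stress_first_use(int streams, long rounds)
{
    pthread_t thr[MAX_STREAMS];
    long total = 0;
    int started = 0;

    assert(streams > 0 && streams <= MAX_STREAMS);
    for (int i = 0; i < streams; i++) {
        if (pthread_create(&thr[i], NULL, stream,
            (void *)(intptr_t)rounds) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        void *res;

        pthread_join(thr[i], &res);
        total += (long)(intptr_t)res;
    }
    return total;
}
```

Built with "cc -pthread" and called from a small main() as, say,
stress_first_use(8, 100000), a hang with 100% CPU in this program
would point at the library rather than at Varnish; a clean run proves
nothing either way.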
Any insights?
Poul-Henning
PS: I'll arrive in Karlsruhe Friday morning...
[1] It is an easy test to set up:
	svn co http://www.varnish-cache.org/svn/trunk
	cd trunk/varnish-cache
	sh autogen.des
	make
	cd varnish-cache/bin/varnishtest
	while gmake -j 12 -f Makefile.kristian check
	do
		true
	done
Look for test failures with
"HTTP rx failed (poll: Unknown error: 0)"
A couple of the test cases may fail under high load
for other reasons, in particular m00001.vtc and
c00002.vtc.
The varnishtest driver program can also be hit, but this
happens much more seldom; when it does, it usually leaves a
core dump with a useless backtrace.
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.