suspect problems on -current with pthread_cond_*()

Poul-Henning Kamp phk at
Wed Oct 6 18:13:58 UTC 2010

Hi Guys,

I updated my machine to -current (9.0-CURRENT #0 r213377M: Mon Oct
4; the previous version was from sometime in April) and have started
to see weird new problems with the Varnish regression tests.

It's pretty hard to get a trace on the problem, but from what I
have found out so far, it is related to the very first operation(s)
on a pthread_cond_t, and the typical indication is a 100% CPU spin
inside libthr.

I can reproduce the problem in approx 5 minutes by repeatedly running
the automated Varnish regression tests in >=8 parallel streams[1],
but due to the nature/complexity of Varnish, I have not been able to
get a debugger to give me a useful backtrace yet.

I only use pthread_cond_t's in two isolated places and I am going to
muck about with them now, to see if I can affect the issue in any way
(higher/lower failure rate etc).
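
For anybody who wants to poke at this outside Varnish, here is a
rough sketch of a stand-alone stress test. It is not taken from the
Varnish sources and is not known to trigger the spin; it only
hammers the "first operation on a fresh pthread_cond_t" pattern from
several parallel streams. All names in it (struct handshake,
run_cond_stress(), the local AZ() macro, COND_STRESS_MAIN) are made
up for illustration.

```c
/* Hypothetical stand-alone sketch; not Varnish code. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Local helper: abort if a pthread call returns non-zero. */
#define AZ(x) do { if ((x) != 0) abort(); } while (0)

struct handshake {
	pthread_mutex_t	mtx;
	pthread_cond_t	cond;
	int		ready;
};

struct stream {
	int		rounds;		/* rounds requested */
	int		done;		/* rounds completed */
};

/* The waiter's first use of the condvar is normally a wait. */
static void *
waiter(void *priv)
{
	struct handshake *hs = priv;

	AZ(pthread_mutex_lock(&hs->mtx));
	while (!hs->ready)
		AZ(pthread_cond_wait(&hs->cond, &hs->mtx));
	AZ(pthread_mutex_unlock(&hs->mtx));
	return (NULL);
}

/* One round: fresh condvar, one waiter, one signal, destroy. */
static void
one_round(void)
{
	struct handshake hs;
	pthread_t thr;

	AZ(pthread_mutex_init(&hs.mtx, NULL));
	AZ(pthread_cond_init(&hs.cond, NULL));
	hs.ready = 0;
	AZ(pthread_create(&thr, NULL, waiter, &hs));
	AZ(pthread_mutex_lock(&hs.mtx));
	hs.ready = 1;
	AZ(pthread_cond_signal(&hs.cond));
	AZ(pthread_mutex_unlock(&hs.mtx));
	AZ(pthread_join(thr, NULL));
	AZ(pthread_cond_destroy(&hs.cond));
	AZ(pthread_mutex_destroy(&hs.mtx));
}

static void *
stream_main(void *priv)
{
	struct stream *st = priv;

	for (st->done = 0; st->done < st->rounds; st->done++)
		one_round();
	return (NULL);
}

/* Run nstreams parallel streams; return total rounds completed. */
int
run_cond_stress(int nstreams, int rounds)
{
	pthread_t *thr = calloc(nstreams, sizeof *thr);
	struct stream *st = calloc(nstreams, sizeof *st);
	int i, total = 0;

	if (thr == NULL || st == NULL)
		abort();
	for (i = 0; i < nstreams; i++) {
		st[i].rounds = rounds;
		AZ(pthread_create(&thr[i], NULL, stream_main, &st[i]));
	}
	for (i = 0; i < nstreams; i++) {
		AZ(pthread_join(thr[i], NULL));
		total += st[i].done;
	}
	free(st);
	free(thr);
	return (total);
}

#ifdef COND_STRESS_MAIN
int
main(void)
{
	printf("%d rounds completed\n", run_cond_stress(8, 10000));
	return (0);
}
#endif
```

Build it with -DCOND_STRESS_MAIN and -pthread to get a program; if
it stops making progress while pegging a CPU, attaching a debugger
to it should be far easier than attaching to varnishd.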

Any insights ?


PS: I'll arrive in Karlsruhe Friday morning...

[1] It is an easy test to set up:

	svn co
	cd trunk/varnish-cache
	sh autogen.des
	cd varnish-cache/bin/varnishtest
	while gmake -j 12 -f Makefile.kristian check ; do :; done

	Look for test-failures with
		"HTTP rx failed (poll: Unknown error: 0)"

	A couple of the test cases may fail under high load
	for other reasons, in particular m00001.vtc and

	The varnishtest driver program can also be hit, but this
	happens much more rarely; it usually leaves a core dump
	with a useless backtrace.

