FreeBSD 5.2.1: Mutex/Spinlock starvation?
alex at metrocom.ru
Fri Jun 4 15:14:13 GMT 2004
I can't say how the issue might be connected with mutexes and so on,
but to solve your problem with apache, I'd look into the
'hold_off_on_exponential_spawning' and 'MAX_SPAWN_RATE' parameters in
src/main/http_main.c of the apache source tree (presuming you're using
apache 1.3.*); I'm sure similar options can be found for apache 2.0.
What you need is to slow down apache's forking rate, so the server
does not suffer from a sudden load peak.
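To illustrate what those two parameters control, here is a rough sketch of
exponential spawning with a cap. It is not the actual http_main.c code: the
struct, the function name spawn_pass() and the default of 32 are made up for
illustration, so check your own source tree for the real values.

```c
/* Hypothetical cap on children forked per maintenance pass.
 * The value 32 is an assumption for this sketch; lowering it
 * (e.g. -DMAX_SPAWN_RATE=4) slows the ramp-up. */
#ifndef MAX_SPAWN_RATE
#define MAX_SPAWN_RATE 32
#endif

struct spawn_state {
    int idle_spawn_rate; /* children to fork on the next pass, starts at 1 */
    int hold_off;        /* passes to wait before the rate may double again */
};

/* One maintenance pass. Returns how many children should be forked now.
 * idle       - number of currently idle children
 * min_idle   - desired minimum of idle children
 * free_slots - free process slots left below the configured maximum */
static int spawn_pass(struct spawn_state *s, int idle, int min_idle,
                      int free_slots)
{
    int to_spawn;

    if (idle >= min_idle) {
        /* Enough idle children: reset to the slowest rate. */
        s->idle_spawn_rate = 1;
        return 0;
    }

    /* Never fork more children than there are free slots. */
    to_spawn = s->idle_spawn_rate < free_slots ? s->idle_spawn_rate
                                               : free_slots;

    if (s->hold_off > 0)
        s->hold_off--;                /* keep the current rate for now */
    else if (s->idle_spawn_rate < MAX_SPAWN_RATE)
        s->idle_spawn_rate *= 2;      /* exponential ramp-up until the cap */

    return to_spawn;
}
```

With the cap at 32 and no hold-off, successive passes on a starved server
would fork 1, 2, 4, 8, 16, 32, 32, ... children; a smaller cap or a longer
hold-off flattens that curve.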
Just my $0.02 :)
Alexander Varshavchick, Metrocom Joint Stock Company
Phone: (812)118-3322, 118-3115(fax)
On Thu, 3 Jun 2004, Ali Niknam wrote:
> Hi Guys,
> First of all: this is my first posting in this group so please be gentle :)
> The other day I was upgrading a system from FreeBSD 4.5 single CPU to
> FreeBSD 5.2.1 dual CPU and I came across a terrible problem.
> The system is used as a rather busy webserver, with continuously about 1200
> apache processes, and about 200 mysql pthreads.
> The problem I ran into is that when apache starts it needs to create a lot
> of children quickly. When it does so, at a given time, after about a minute
> or so, a couple of children go into "Giant" status mode. After a few seconds
> more and more processes go into Giant mode, up to the point that the system
> becomes totally unresponsive (even for keyboard input). The only remedy
> is to disconnect the UTP cable and wait a few seconds; then kill everything.
> Now the nice part is: this happens only if I set apache's maxclients > 1250.
> Under 1250 the same scenario happens but after a minute or so the system
> recovers.
> Now I unfortunately do not know enough about the internals of BSD to make
> an educated guess, but I'll give it a shot nevertheless: my estimate is that
> due to the tremendous number of 'locked' processes the system simply starves
> of CPU to do anything. My guess is the locking mechanism probably uses
> some kind of 'spin' to wait until the resource is unlocked (whichever
> resource it is, probably something network related, though).
> This is based upon the fact that this does not happen if you slightly
> decrease the number of apache processes; in that case the same
> scenario plays out, but after a minute or so the system recovers!
> (probably because it has just enough CPU to handle everything as apache
> hits its limit?)
> Now if this is indeed the case I was thinking of something like a sysctl
> MUTEX_BLOCK_THRESHOLD set to something like 50. If the system detects that
> the number of processes locked is higher than this number, then it stops
> 'spinning' for resources, but instead uses a 'blocking' mechanism (simply
> puts the processes in a 'waiting' queue).
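For what it's worth, the idea described above can be sketched in userland
with POSIX threads. This is only an illustration of "spin while few are
waiting, block once too many are", not how the FreeBSD kernel implements its
mutexes; hybrid_lock, BLOCK_THRESHOLD and SPIN_LIMIT are hypothetical names
and values.

```c
#include <pthread.h>
#include <stdatomic.h>

#define BLOCK_THRESHOLD 50   /* hypothetical: mirrors the 50 proposed above */
#define SPIN_LIMIT      1000 /* hypothetical bound on trylock attempts */

struct hybrid_lock {
    pthread_mutex_t mtx;     /* the underlying blocking mutex */
    atomic_int      waiters; /* threads currently trying to acquire */
};

static void hybrid_lock_init(struct hybrid_lock *l)
{
    pthread_mutex_init(&l->mtx, NULL);
    atomic_init(&l->waiters, 0);
}

/* Acquire the lock. Returns 1 if it was taken while spinning,
 * 0 if the caller had to block in pthread_mutex_lock(). */
static int hybrid_lock_acquire(struct hybrid_lock *l)
{
    int got_it = 0;
    /* Number of other threads already contending before we joined. */
    int contenders = atomic_fetch_add(&l->waiters, 1);

    if (contenders < BLOCK_THRESHOLD) {
        /* Few contenders: a bounded spin is assumed to be cheap. */
        for (int i = 0; i < SPIN_LIMIT && !got_it; i++)
            got_it = (pthread_mutex_trylock(&l->mtx) == 0);
    }
    if (!got_it) {
        /* Too many contenders, or the spin budget ran out: sleep
         * on the mutex instead of burning CPU. */
        pthread_mutex_lock(&l->mtx);
    }
    atomic_fetch_sub(&l->waiters, 1);
    return got_it;
}

static void hybrid_lock_release(struct hybrid_lock *l)
{
    pthread_mutex_unlock(&l->mtx);
}
```

The waiter counter is what makes the threshold possible: above it, new
arrivals skip spinning entirely and go straight to the wait queue.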
> I would be very interested to hear what this problem could be; perhaps I can
> test a little if someone has solutions (I can't test much unfortunately,
> it's a production system).
> Best Regards,
> Ali Niknam