cvs commit: src/sys/i386/i386 pmap.c

Stephan Uphoff ups at tree.com
Tue Nov 9 10:21:11 PST 2004


On Tue, 2004-11-09 at 13:02, Julian Elischer wrote:
> Robert Watson wrote:
> 
> >This change made a large difference, and eliminates the unexplained costs.
> >Here's a revised table as compared to the above:
> >
> >        sleep mutex     crit section    spin mutex      new spin mutex
> >        UP      SMP     UP      SMP     UP      SMP     UP      SMP
> >PIII    21      81      83      81      112     141     95      141
> >P4      39      260     120     119     274     342     132     231
> >
> >So it basically cut 140 cycles off the P4 UP spin lock, 15 off the PIII UP
> >spin lock, and 110 cycles off the P4 SMP spin lock.  The PIII SMP spin
> >lock looks the same.  Keep in mind that all of these measurements have a
> >standard deviation of between 0 and 3 cycles, most in the 1 range.  Also
> >keep in mind that these are entirely uncontended measurements.
> >
> >Assuming that these changes are correct, and pass whatever tests people
> >have in mind, this would be a very strong merge candidate for performance
> >reasons.  The difference is visible in packet send tests from user space
> >as a percentage or two improvement on UP on my P4, although it's a little
> >hard to tell due to the noise.
> >  
> >
> Can you explain why a spin mutex is more expensive than a sleep mutex (I
> assume this is uncontended)?

cli() and sti(), which are used for the critical section, are expensive.
(The spin mutex includes the critical section.)

I recall a USENIX paper about avoiding the cost of cli() and sti() by
just setting an in-memory flag. The interrupt handler was modified to
honor the flag and delay interrupt processing until the flag was cleared.
This could drastically reduce the cost of a spin mutex if interrupts
during critical regions are infrequent.
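
A rough sketch of the flag idea, as a single-CPU user-space model. This
is not the paper's code and not how FreeBSD does it; every name below
(soft_critical_enter, interrupt_entry, pending_irqs, and so on) is made
up for illustration, and interrupt_entry() merely stands in for the
low-level interrupt stub.

```c
#include <assert.h>

/* Hypothetical names throughout; single-CPU, user-space model only. */
#define NIRQ 32

static int crit_nesting;              /* > 0 while inside a critical section */
static unsigned pending_irqs;         /* bit n set: IRQ n arrived while deferred */
static unsigned handled_count[NIRQ];  /* times each IRQ's real handler has run */

/* Stand-in for the real interrupt service routine. */
static void run_handler(unsigned irq)
{
    handled_count[irq]++;
}

/* Entering costs one memory increment instead of a cli. */
static void soft_critical_enter(void)
{
    crit_nesting++;
}

/* Models the low-level interrupt stub: honor the flag, or run the handler. */
static void interrupt_entry(unsigned irq)
{
    assert(irq < NIRQ);
    if (crit_nesting > 0) {
        pending_irqs |= 1u << irq;    /* remember it and return immediately */
        return;
    }
    run_handler(irq);
}

/* Leaving clears the flag and replays whatever was deferred. */
static void soft_critical_exit(void)
{
    assert(crit_nesting > 0);
    if (--crit_nesting > 0)
        return;                       /* still nested; keep deferring */
    for (unsigned irq = 0; pending_irqs != 0 && irq < NIRQ; irq++) {
        if (pending_irqs & (1u << irq)) {
            pending_irqs &= ~(1u << irq);
            run_handler(irq);
        }
    }
}
```

In the common case (no interrupt arrives inside the section) enter and
exit touch only memory, which is where the saving would come from. Since
the model keeps one pending bit per IRQ, repeated arrivals of the same
IRQ while deferred collapse into a single replay. A real kernel would
also have to keep the hardware from re-raising a deferred interrupt and
close the race between clearing the flag and checking the pending mask;
the model ignores both.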

	Stephan


