cvs commit: src/sys/i386/i386 pmap.c
ups at tree.com
Tue Nov 9 10:21:11 PST 2004
On Tue, 2004-11-09 at 13:02, Julian Elischer wrote:
> Robert Watson wrote:
> >This change made a large difference, and eliminates the unexplained costs.
> >Here's a revised table as compared to the above:
> >           sleep mutex    crit section      spin mutex  new spin mutex
> >            UP     SMP      UP     SMP      UP     SMP      UP     SMP
> >PIII        21      81      83      81     112     141      95     141
> >P4          39     260     120     119     274     342     132     231
> >So it basically cut 140 cycles off the P4 UP spin lock, 15 off the PIII UP
> >spin lock, and 110 cycles off the P4 SMP spin lock. The PIII SMP spin
> >lock looks the same. Keep in mind that all of these measurements have a
> >standard deviation of between 0 and 3 cycles, most in the 1 range. Also
> >keep in mind that these are entirely uncontended measurements.
> >Assuming that these changes are correct, and pass whatever tests people
> >have in mind, this would be a very strong merge candidate for performance
> >reasons. The difference is visible in packet send tests from user space
> >as a percentage or two improvement on UP on my P4, although it's a little
> >hard to tell due to the noise.
> Can you explain why a spin mutex is more expensive than a sleep mutex (I
> assume this is uncontested)?
The cli() and sti() used for the critical section are expensive.
(The spin mutex includes the critical section.)
I recall a USENIX paper about avoiding the cost of cli() and sti() by just
setting an in-memory flag. The interrupt handler was modified to honor
the flag and delay interrupt processing until the flag was cleared.
This could drastically decrease the cost of a spin mutex if interrupts
during critical regions are infrequent.
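
A minimal user-space sketch of the flag idea, to make the control flow
concrete. It is not taken from the paper or from the FreeBSD tree: every
name below (soft_critical_enter, soft_critical_exit, fake_interrupt and
the variables) is hypothetical, the "interrupt" is an ordinary function
call, and the state is global rather than per-CPU. A real kernel would
also have to close the race between the exit-path check and an arriving
interrupt, and mask the hardware source while an interrupt is deferred.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; none of these are FreeBSD kernel APIs. */
static volatile int  soft_crit_depth;   /* > 0: interrupts are logically masked */
static volatile bool soft_intr_pending; /* an interrupt arrived while masked */
static int handled_count;               /* interrupts actually serviced so far */

/* Stand-in for the real interrupt service work. */
static void service_interrupt(void)
{
    handled_count++;
}

/* Stand-in for the low-level interrupt entry point. */
static void fake_interrupt(void)
{
    if (soft_crit_depth > 0) {
        /* Honor the flag: remember the interrupt and return at once. */
        soft_intr_pending = true;
        return;
    }
    service_interrupt();
}

/* Entering costs one memory write instead of a cli. */
static void soft_critical_enter(void)
{
    soft_crit_depth++;
}

/* Leaving costs one write and one test instead of an sti, unless an
 * interrupt was deferred, in which case it is serviced here. */
static void soft_critical_exit(void)
{
    assert(soft_crit_depth > 0);
    if (--soft_crit_depth == 0 && soft_intr_pending) {
        soft_intr_pending = false;
        service_interrupt();
    }
}
```

Calling soft_critical_enter(), then fake_interrupt(), leaves
handled_count unchanged until soft_critical_exit() runs. Note that this
sketch collapses several deferred interrupts into a single service call,
which a real design would have to account for. The common path (no
interrupt inside the region) never touches the interrupt-enable bit,
which is where the saving would come from.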