macro benchmark for mutex locks needed.

Stephan Uphoff ups at tree.com
Tue Nov 23 12:00:01 PST 2004


On Tue, 2004-11-23 at 11:49, Robert Watson wrote:
> On Tue, 23 Nov 2004, Stephan Uphoff wrote:
> 
> > I have a bunch of ideas to speed up spin and mutex locks somewhat.  For
> > this I need benchmarks to test different modifications. 
> > 
> > While the micro-benchmark from rwatson@ is a good way to quickly test
> > modifications to weed out unlikely candidates - jhb@ tests have shown
> > that micro and macro-benchmarks do not always show the same result. 
> > 
> > Running benchmarks and booting takes a lot of time.  Since this is NOT
> > one of my favorite tasks I want to run generally accepted benchmarks so I
> > can test (boot) each modification exactly once for each test machine. 
> > 
> > If you think I should run certain benchmarks with certain parameters
> > please tell me BEFORE I start testing! 
> 
> I like to use netblast from src/tools/tools/netrate/netblast.  It attempts
> to send packets as quickly as possible on a network interface, which is a
> CPU-intensive operation that is very sensitive to the cost of
> synchronization.  On an SMP system, it also generates a moderate ithread
> load as the gig-e interface transmits, and that ithread will often contend
> on the network interface driver lock with the running netblast thread.  As
> such, changes that affect the cost and handling of contention are also
> visible in this benchmark.  With the synchronization micro-benchmark, I
> see spin locks on SMP being faster with the atomic release removed, but in
> the netblast test, I see those spinlocks as slower on SMP, since they
> behave less well under contention.
> 
> (The above with 64-bit if_em cards on a dual-Xeon).  Note that you'll want
> to make sure netreceive is running on a second box, or that you're sending
> to the broadcast address, or the icmp errors will substantially quench
> your send ability due to the asynchronous report of the closed port.

Mhh...

My initial SMP test machine will be a Dell 1600SC dual-Xeon (P4 - 2.8
GHz/400MHz bus). It has a built-in em Ethernet interface. Unfortunately
it is only an 82540EM / 32-bit chip and it shares the PCI bus with a few
33MHz PCI cards :-(. The machine has an unused PCI bus with a free PCI-X
slot but I would need to order a server card.
What is your normal data rate with this test - any chance that the
82540EM will be sufficient?
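
For a rough sense of scale, the theoretical ceilings can be worked out
on paper. The sketch below is only back-of-the-envelope arithmetic, not
measured data. It assumes the shared bus runs at the 33 MHz of the
slowest cards and uses the standard Ethernet per-frame wire overhead
(preamble plus inter-frame gap). The function and constant names are
made up for illustration.

```python
# Back-of-the-envelope upper bounds only. A shared bus and real-world
# arbitration overhead will push achievable numbers below these.

PCI_WIDTH_BITS = 32               # 82540EM is a 32-bit part (from the message)
PCI_CLOCK_HZ = 33_000_000         # assumption: bus clocked at 33 MHz
GIGE_BITS_PER_SEC = 1_000_000_000

# Bytes on the wire per frame that are not part of the frame itself:
# 8 bytes preamble/SFD + 12 bytes inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def pci_peak_bits_per_sec(width_bits=PCI_WIDTH_BITS, clock_hz=PCI_CLOCK_HZ):
    """Theoretical peak transfer rate of the bus, ignoring all overhead."""
    return width_bits * clock_hz


def gige_max_frames_per_sec(frame_bytes):
    """Line-rate frame count for gig-e; frame_bytes includes header and FCS."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return GIGE_BITS_PER_SEC // bits_per_frame


print(pci_peak_bits_per_sec())        # bus ceiling in bit/s
print(gige_max_frames_per_sec(64))    # minimum-size frames per second
print(gige_max_frames_per_sec(1518))  # maximum-size untagged frames per second
```

The bus ceiling comes out only slightly above gig-e line rate, and that
ceiling is shared with the other cards, so the bus may well be the
limit before the locks are.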

The data sink will be a 32bit em card with an ancient slow P4 processor
using a cross-over cable.
Since this combination is probably not able to sink enough data I plan
to add a static ARP entry for a dummy remote IP address to the
SMP machine. This should keep the data sink's em card from actually
filling the receive buffers. Since this takes the PCI bus and the slow
processor out of the equation this should be a perfect data sink -
right?
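
To make the sender side of this setup concrete, here is a minimal
sketch of a netblast-style loop. It is not the netblast source, just an
illustration of the idea of sending UDP datagrams as fast as possible
for a fixed time and counting what got through. The function name
blast and its parameters are hypothetical. The static ARP entry itself
would be set up separately (e.g. with arp -s on the SMP machine) and is
not shown here.

```python
import socket
import time


def blast(host, port, payload_size=64, duration=1.0):
    """Send UDP datagrams to (host, port) as fast as possible.

    Runs for `duration` seconds and returns (sent, errors):
    datagrams accepted by the stack, and send calls that failed.
    """
    payload = b"\x00" * payload_size
    sent = 0
    errors = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        deadline = time.monotonic() + duration
        while time.monotonic() < deadline:
            try:
                sock.sendto(payload, (host, port))
                sent += 1
            except OSError:
                # For example a full interface queue; count it and keep going.
                errors += 1
    return sent, errors
```

Calling blast() with the dummy remote IP address, a fixed payload size
and a fixed duration on each kernel under test would give a
packets-per-run figure to compare between lock modifications.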
