Better "hash_packet6"

Luigi Rizzo rizzo at icir.org
Wed Dec 6 01:29:38 PST 2006


On Wed, Dec 06, 2006 at 04:51:51AM +0100, Max Laier wrote:
> On Wednesday 06 December 2006 01:17, Luigi Rizzo wrote:
...
> > First, this proposal, with 36 multiplies and one division, the
> > function seems rather expensive for e.g. a low end cpu (arm or
> > soekris) as you might find on network-appliance boxes.
> > Any chance to get performance numbers on that hardware ?
> 
> I tried the reference machines (see hacked up attachment):
> 78x ia64
> 40x amd64
> 60x p3
> 16x p4

I assume the number on each line is the slowdown of the proposed
method relative to the current one?
I have slightly modified and extended the program, adding the hsieh hash
that I mentioned below and making it easy to add more methods. The
code is at

	http://info.iet.unipi.it/~luigi/hc.c

As expected, the simplest algorithms in particular depend a lot
on cache effects, so the results vary a lot if you change the number
of packets or the number of loops. In any case, on a soekris (where
the default parameters are too high):

	# ./hc 1000 100
	starting algorithm hash_pkt5 1000 loops 100 packets
	took 87862 usec, 0.878620 per cycle
	starting algorithm hash_hsieh 1000 loops 100 packets
	took 1082883 usec, 10.828830 per cycle
	starting algorithm hash_pkt6 1000 loops 100 packets
	took 2697178 usec, 26.971780 per cycle
	# ./hc 100 10000
	starting algorithm hash_pkt5 100 loops 10000 packets
	took 2023610 usec, 2.023610 per cycle
	starting algorithm hash_hsieh 100 loops 10000 packets
	took 11619238 usec, 11.619238 per cycle
	starting algorithm hash_pkt6 100 loops 10000 packets
	took 27739595 usec, 27.739595 per cycle
(here we probably overflow some cache, so the simple algorithm
suffers a lot as the number of distinct packets grows)
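
For readers without access to hc.c, the measurement pattern behind
these numbers can be sketched roughly as follows. This is not the
code from hc.c: the function names, the hash_fn signature, the
36-byte key size and the placeholder hash are all assumptions made
for illustration. The only thing taken from the output above is that
"per cycle" equals the total time divided by loops * packets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical signature for a hash under test: hashes one packet key. */
typedef uint32_t (*hash_fn)(const uint8_t *key, size_t len);

/* Trivial placeholder hash, NOT one of the algorithms in hc.c. */
uint32_t
hash_placeholder(const uint8_t *key, size_t len)
{
	uint32_t h = 0;
	size_t i;

	for (i = 0; i < len; i++)
		h = (h << 5) ^ (h >> 27) ^ key[i];
	return h;
}

/*
 * Run `loops` passes over `npkts` keys of `keylen` bytes each and
 * return the elapsed wall-clock time in microseconds.
 */
double
bench_usec(hash_fn fn, const uint8_t *keys, size_t keylen,
    unsigned npkts, unsigned loops, volatile uint32_t *sink)
{
	struct timeval t0, t1;
	uint32_t acc = 0;
	unsigned l, p;

	gettimeofday(&t0, NULL);
	for (l = 0; l < loops; l++)
		for (p = 0; p < npkts; p++)
			acc += fn(keys + (size_t)p * keylen, keylen);
	gettimeofday(&t1, NULL);

	*sink = acc;	/* keep the compiler from discarding the loop */
	return (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
}

/* Cost of one hash call: total time divided by loops * packets. */
double
per_cycle_usec(double total_usec, unsigned loops, unsigned npkts)
{
	assert(loops > 0 && npkts > 0);
	return total_usec / ((double)loops * npkts);
}
```

With a larger npkts the key array no longer fits in the small caches
of a low-end box, which is one plausible reason the cheapest hash
degrades the most between the two runs.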

On my new 3 GHz Pentium D:

	> ./hc 1000 100
	starting algorithm hash_pkt5 1000 loops 100 packets
	took 1258 usec, 0.012580 per cycle
	starting algorithm hash_hsieh 1000 loops 100 packets
	took 16152 usec, 0.161520 per cycle
	starting algorithm hash_pkt6 1000 loops 100 packets
	took 25485 usec, 0.254850 per cycle
	> ./hc 100 10000
	starting algorithm hash_pkt5 100 loops 10000 packets
	took 12870 usec, 0.012870 per cycle
	starting algorithm hash_hsieh 100 loops 10000 packets
	took 162510 usec, 0.162510 per cycle
	starting algorithm hash_pkt6 100 loops 10000 packets
	took 248003 usec, 0.248003 per cycle
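
To compare these per-cycle figures with the "Nx" slowdown factors
quoted at the top, one can divide each algorithm's cost by that of
hash_pkt5 on the same machine. A small sketch, using only the
"./hc 1000 100" numbers reported above (the struct and function
names are made up for this example):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* usec per cycle from the "./hc 1000 100" runs reported above. */
struct run {
	const char *box;
	double pkt5, hsieh, pkt6;
};

const struct run runs[] = {
	{ "soekris",   0.878620, 10.828830, 26.971780 },
	{ "pentium D", 0.012580,  0.161520,  0.254850 },
};

/* How many times slower `candidate` is than `baseline`. */
double
slowdown(double candidate, double baseline)
{
	assert(baseline > 0.0);
	return candidate / baseline;
}

/* Print one line per machine with both slowdown factors. */
void
print_slowdowns(FILE *out)
{
	size_t i;

	for (i = 0; i < sizeof(runs) / sizeof(runs[0]); i++)
		fprintf(out, "%-10s hsieh %.1fx  pkt6 %.1fx\n",
		    runs[i].box,
		    slowdown(runs[i].hsieh, runs[i].pkt5),
		    slowdown(runs[i].pkt6, runs[i].pkt5));
}
```

Calling print_slowdowns(stdout) gives roughly 12x and 31x on the
soekris, and roughly 13x and 20x on the Pentium D.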

Surely we need to experiment a bit more, but the cost
is significant, especially on low-end boxes.
Maybe we could restrict the hash to just a part of the address?
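
As a rough illustration of what "part of the address" could mean,
the sketch below hashes only the low 64 bits of each IPv6 address
plus the ports. It is purely hypothetical: the struct layout, the
names and the XOR mixing are assumptions made for this example, not
ipfw code and not any of the algorithms measured above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flow key; field names are illustrative, not ipfw's. */
struct flow6 {
	uint8_t  src[16];
	uint8_t  dst[16];
	uint16_t sport;
	uint16_t dport;
};

/* Alignment-safe 32-bit load in host byte order. */
uint32_t
load32(const uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/*
 * Hash only bytes 8..15 of each address plus the ports.
 * nbuckets must be a power of two.
 */
uint32_t
hash_low64(const struct flow6 *f, uint32_t nbuckets)
{
	uint32_t h;

	assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
	h = load32(f->src + 8) ^ load32(f->src + 12) ^
	    load32(f->dst + 8) ^ load32(f->dst + 12) ^
	    (((uint32_t)f->sport << 16) | f->dport);
	h ^= h >> 16;	/* fold the high half into the bits the mask keeps */
	return h & (nbuckets - 1);
}
```

The trade-off is that two flows differing only in the upper 64 bits
land in the same bucket, so whether this is acceptable depends on
how the addresses seen in real traffic are distributed.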

	cheers
	luigi


More information about the freebsd-ipfw mailing list