Better "hash_packet6"
Luigi Rizzo
rizzo at icir.org
Wed Dec 6 01:29:38 PST 2006
On Wed, Dec 06, 2006 at 04:51:51AM +0100, Max Laier wrote:
> On Wednesday 06 December 2006 01:17, Luigi Rizzo wrote:
...
> > First, this proposal, with 36 multiplies and one division, the
> > function seems rather expensive for e.g. a low end cpu (arm or
> > soekris) as you might find on network-appliance boxes.
> > Any chance to get performance numbers on that hardware ?
>
> I tried the reference machines (see hacked up attachment):
> 78x ia64
> 40x amd64
> 60x p3
> 16x p4
i assume the first number is the slowdown between the current and
the proposed method ?
I have slightly modified/extended the program adding the hsieh hash
that i mentioned below, and made it easy to add more methods. the
code is at
http://info.iet.unipi.it/~luigi/hc.c
as expected, especially the simplest algorithms depend a lot
on cache effects so if you change the number of packets or the number
of loops things vary a lot. In any case, on a soekris (the default
parameters are too high):
# ./hc 1000 100
starting algorithm hash_pkt5 1000 loops 100 packets
took 87862 usec, 0.878620 per cycle
starting algorithm hash_hsieh 1000 loops 100 packets
took 1082883 usec, 10.828830 per cycle
starting algorithm hash_pkt6 1000 loops 100 packets
took 2697178 usec, 26.971780 per cycle
# ./hc 100 10000
starting algorithm hash_pkt5 100 loops 10000 packets
took 2023610 usec, 2.023610 per cycle
starting algorithm hash_hsieh 100 loops 10000 packets
took 11619238 usec, 11.619238 per cycle
starting algorithm hash_pkt6 100 loops 10000 packets
took 27739595 usec, 27.739595 per cycle
(here we probably overflow some cache so the simple algorithm
suffers a lot by increasing the number of different packets)
on my new 3 GHz pentium D
> ./hc 1000 100
starting algorithm hash_pkt5 1000 loops 100 packets
took 1258 usec, 0.012580 per cycle
starting algorithm hash_hsieh 1000 loops 100 packets
took 16152 usec, 0.161520 per cycle
starting algorithm hash_pkt6 1000 loops 100 packets
took 25485 usec, 0.254850 per cycle
> ./hc 100 10000
starting algorithm hash_pkt5 100 loops 10000 packets
took 12870 usec, 0.012870 per cycle
starting algorithm hash_hsieh 100 loops 10000 packets
took 162510 usec, 0.162510 per cycle
starting algorithm hash_pkt6 100 loops 10000 packets
took 248003 usec, 0.248003 per cycle
Surely we need to experiment a bit more, but the cost
is significant especially on low end boxes.
Maybe we could restrict the hash to just a part of the address ?
cheers
luigi
More information about the freebsd-ipfw
mailing list