BSD license compatible hash algorithm?

perryh at pluto.rain.com perryh at pluto.rain.com
Sun Dec 30 20:14:23 PST 2007


"Aryeh M. Friedman" <gmail.com!aryeh.friedman at agora.rdrop.com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Dag-Erling Smørgrav wrote:
> > "Aryeh M. Friedman" <aryeh.friedman at gmail.com> writes:
> >> All hashes have issues with pooling.... see
> >> http://www.burtleburtle.net/bob/hash/index.html... btw it is
> >> an old wives' tale that the number of buckets should be prime
> >> (mostly based on the very weak implementation Knuth offered)
> >
> > Not an "old wives' tale", but rather an easy way to implement a
> > hash algorithm that is good enough for most simple uses: metric
> > modulo table size, where metric is a number derived from the
> > item in such a manner as to give a good spread.
>
> ... the above only applies if you're using a very primitive hash
> like Knuth's multiplication one.... every modern hash I know of
> should have 2^k buckets actually for some k ...

It very much depends on what is used for a rehash (collision step)
value.  The step value and the table size must have no common factor
larger than 1, or there will be edge cases (bugs) in which some
combinations of hash and step values will cause the table to appear
full when in fact it is not.  Making the table size prime is one
simple way of preventing such problems, while still allowing the
rehash value to depend on the key (thus reducing the likelihood of
collision on the second probe).
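
To make the common-factor problem concrete, here is a minimal sketch
in Python (my choice for illustration only; nothing in this thread
uses it).  The function name probe_sequence and the sample sizes are
hypothetical.  It walks an open-addressing probe sequence with a
fixed step and collects the slots that can ever be visited:

```python
from math import gcd

def probe_sequence(start, step, table_size):
    """Return the distinct slots visited when probing with a fixed step."""
    seen = []
    slot = start % table_size
    while slot not in seen:          # stop once the sequence starts repeating
        seen.append(slot)
        slot = (slot + step) % table_size
    return seen

# 8 slots, step 4: gcd(4, 8) == 4, so only 8 // 4 == 2 slots are ever probed.
# If both happen to be occupied, the table looks full although it is not.
print(probe_sequence(3, 4, 8), 8 // gcd(4, 8))

# Same table, step 3: gcd(3, 8) == 1, so every slot is reachable.
print(len(probe_sequence(3, 3, 8)))

# Prime size 7: every step in 1..6 is coprime to it, so any key-dependent
# step in that range reaches the whole table.
print(all(len(probe_sequence(0, s, 7)) == 7 for s in range(1, 7)))
```

In general the sequence reaches table_size / gcd(step, table_size)
slots, which equals the table size only when the two are coprime.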

At the other extreme, the table can be any size at all if the step
value is 1 (and the "modulo table size" operation will certainly be
more efficient if the table size is 2^k).
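
A sketch of that other extreme, again in Python and again with
hypothetical names (linear_probe_insert, and a plain list standing in
for the table).  It assumes the table length is a power of two, so
that "modulo table size" reduces to a bit mask, and it uses a step of
1, which reaches every slot whatever the size:

```python
def linear_probe_insert(table, key, hash_value):
    """Insert key using a step of 1; return the slot it ends up in.

    Assumes len(table) is a power of two, so '& mask' equals '% len(table)'.
    """
    mask = len(table) - 1
    slot = hash_value & mask
    for _ in range(len(table)):      # a step of 1 visits every slot once
        if table[slot] is None or table[slot] == key:
            table[slot] = key
            return slot
        slot = (slot + 1) & mask     # wrap around without a division
    raise RuntimeError("table is full")

table = [None] * 8
# Three keys that all hash to 6: they land in slots 6, 7 and (wrapping) 0.
print([linear_probe_insert(table, k, 6) for k in ("a", "b", "c")])
```

The price of the fixed step is that keys which collide on the first
probe keep colliding on every later one, which is exactly what a
key-dependent step is meant to avoid.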

