patch: portable dirhash

David Malone dwmalone at maths.tcd.ie
Wed Dec 17 11:03:11 PST 2003


On Wed, Dec 17, 2003 at 01:09:18PM -0500, Ted Unangst wrote:
> while on the subject, there's a piece of code something like this in 
> freebsd:
> 	/*
>         * We hash the name and then some other bit of data that is
>         * invariant over the dirhash's lifetime. Otherwise names
>         * differing only in the last byte are placed close to one
>         * another in the table, which is bad for linear probing.
>         */
>        hash = hash32_buf(name, namelen, HASHINIT);
>        hash = hash32_buf(dh, sizeof(dh), hash);
> 
> which isn't doing what you'd expect (hashing the dh pointer), instead it 
> hashes the first bytes of dh, conveniently a constant value so it works, 
> but it provides no benefit.  fix is making it &dh.  i'd provide a diff, 
> but it's a little large. :)

Ahhh! Well spotted. Actually, it does provide the benefit, even
though it it wasn't the value that I'd intended to hash here. Because
of the way that the fnv hash works, turning the handle on the hash
function one more time should usually split up similar filenames.

I'll have a quick think about which one is really better to hash.
I don't think there's an advantage to either in particular.  If we
were to start reallocating dh_hash without freeing dh, then we'd
get a slightly different hash function each time, which might be
a slight advantage. However, we don't do that right now, so...

	David.


More information about the freebsd-hackers mailing list