[Gluster-devel] Faster hashing for DHT

Anand Avati avati at gluster.com
Thu Jan 7 04:14:18 UTC 2010


>> I note that Hsieh's SuperFastHash is already implemented in
>> GlusterFS and is used for other purposes.  It's about 3x as fast as the
>> DM hash, and has better collision resistance as well.  MurmurHash
>> (http://murmurhash.googlepages.com/) is even faster and more collision
>> resistant.  For future releases, I suggest dropping the DM hash and
>> switching to one of these others.
>>

We will investigate murmurhash and see if it suits the role better.
What we look for in the hashing algorithm is better hash distribution
specifically for filenames (ascii strings 1-255 characters long).
Better distribution of user data in general is not of much help for
DHT's needs. DM hash does a very good job in distributing hashes for
ascii strings 1-256 bytes long (file basenames). It is also the same
algorithm used in reiserfs which is very sensitive to filename hash
collisions. Speed is really a micro optimization considering the small
portion of time which is used for actual hash computation.

Will keep you posted when we have comparison results with murmurhash.

Avati





More information about the Gluster-devel mailing list