Tuesday, July 4th, 2006

Perfect Hashing

Filed under: — Daniel Lemire @ 12:28

Suppose you could build a collision-free hash table, how fast would it be? It would be extremely fast, almost as fast as looking up data in an array.

As it turns out, collision-free hash tables have been possible for quite something and that’s called perfect hashing. See for example GNU gperf, for an implementation.

One way to building a collision-free hash table is to use two hashes: r mod q and r mod p. It is important that p and q be coprimes and that q be large enough to store all of your data. Then, these two hashes are used to create two layers of hash tables (h1 and h2): given the key a, you retrieve its value h(a) by computing h1(h2(a)). In other words, h1 stores the values and h2 uses your keys. Using two hash tables buys you the freedom of choosing (through heuristics) the values of h2, or equivalently, the keys of h1.

The downsides are that construction time might be slower and that the data structure cannot be easily updated. The upsides is that you can use very little storage and have tremendously fast look-ups.

Reference:
S. Lefebvre, H. Hoppe, Perfect spatial hashing, ACM SIGGRAPH 2006.

No Comments »

No comments yet.

RSS feed for comments on this post.

Leave a comment

Warning: When entering a long comment, please ensure that you make copy of your text prior to submitting it. If the server should fail or if you hit a bug, you might lose your work. I am not responsible for your lost effort.

To spammers: I carefully review every single post and make sure that spam gets deleted. You are wasting your time if you are manually entering spam using this form. Read my terms of use to see what I consider to be abusive.

Example: I + II + IX= XII. Yes, you have to enter a roman numeral. (Answer must be in upper case.)

« Blog's main page

38 queries. 1.092 seconds. Valid XHTML

Powered by WordPress

Subscribe to this blog in a reader or by Email.