[Haskell-cafe] Pure hashtable library

Wed Aug 27 11:55:41 EDT 2008

On Wed 2008-08-27 16:21, Jules Bean wrote:
> Bulat Ziganshin wrote:
>> Hello haskell-cafe,
>>
>> solving one more task that uses English dictionary, i've thought: why we don't yet have pure hashtable library? There is imperative hashtables, pretty complex as they need to rebuild entire table as it grows. There is also simple assoc lists and tree/trie implementations, but there is no simple non-modifiable hashes.
>>
>> how should it look:
>> * hashtable is represented as an array of assoc lists: Array Int [(a,b)]
>>
>> * interface to extract data from ht is the same as from assoc list:
>> lookup :: HT a b -> a -> Maybe b
>>
>> * ht may be built from assoc list. we should just know it's size beforehand in order to create Array of reasonable size. constructor also need a hashing function:
>>
>> create :: [(a,b)] -> Int -> (a->Int) -> HT a b
>>
>>
>> given these constraints, it should be just a 10-20 lines of code, and provide much better efficiency than any tree/trie implementations
>
> Prove it.
>
> To get "much better efficient" than a trie, the hash function has to be  
> so fast that it is faster than following (log n) pointers, and yet also  
> so "perfect" that it doesn't generate too many collisions.

Many people have probably seen this and it has nothing to do with
Haskell, but it is a good performance comparison of a simple hash to an
optimized trie.

  http://www.nothings.org/computer/judy/

The conclusion (we're only interested in lookup times) is that the trie
is preferable for sequential lookups, slower for random access hits, and
about the same for random access misses.  Also, the hash was uniformly
better for small sizes (< 10k).

BTW a single cache miss has a latency around 250 cycles, you can compute
a hell of a hash for that.

Jed
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: not available
Url : http://www.haskell.org/pipermail/haskell-cafe/attachments/20080827/124b1cf0/attachment.bin