[Haskell-cafe] Simple hash table creation

Brad Larsen brad.larsen at gmail.com
Tue Nov 17 16:17:06 EST 2009


On Tue, Nov 17, 2009 at 4:00 PM, michael rice <nowgate at yahoo.com> wrote:
>
> Hi Gregory,
>
> I was wondering about that, because of the following:
>
> [1 of 1] Compiling Main             ( hash1.hs, interpreted )
> Ok, modules loaded: Main.
> *Main> ht <- new (==) dummy :: IO MyHashTable
> *Main> dummy "mike"
> 7
> *Main> dummy "michael"
> 7
> *Main> insert ht "mike" 1
> *Main> toList ht
> [("mike",1)]
> *Main> insert ht "michael" 2
> *Main> toList ht
> [("michael",2),("mike",1)]
> *Main> insert ht "miguel" 3
> *Main> toList ht
> [("miguel",3),("michael",2),("mike",1)]
> *Main> :t dummy "miguel"
> dummy "miguel" :: Int32
> *Main>
>
> It seems my dummy function is being ignored. I figured I would only be able to store a single value with a hash function that always returns 7. Why ask for a hash function and not use it?
[...]

Most hash tables deal with collisions, i.e. the case when two keys
stored in the table hash to the same value.  In the case of your
'dummy' hash function, which always returns 7, every key hashes to the
same value, hence collisions galore.

One way to deal with collisions is to hash a key to a bucket (i.e.
list) of items, then walk down the list looking for the given key.  In
such an implementation (and I believe for hash tables in general), the
quality of the hash function greatly affects the performance of the
hash table operations.

I am not sure what implementation Data.HashTable uses.  However, I
believe Data.HashTable is not all that good:  for multi-million
element tables from Int to Int, Data.IntMap runs many times faster
than Data.HashTable.  I have no wish to start a flame war here (this
topic has in the past), but the state of affairs regarding hash tables
in Haskell is not good.

Sincerely,
Brad


More information about the Haskell-Cafe mailing list