[Haskell-cafe] Very imperfect hash function

Sun Jan 31 15:04:40 EST 2010

On 31 Jan 2010, at 20:07, John Lato wrote:

>>> Or are you suggesting an actual hash table?
>>
>> The hash function folds the keys onto an interval. Since you have  
>> Int values
>> k, you might just use a mod k n function for that.
>>
>>> If it's the
>>> latter, I'm not certain where the array fits into the picture.  I'm
>>> pretty sure I'm missing something here.
>>
>> There is a module Data.HashTable. The array is just to make lookup  
>> fast.
>> Like in:
>>  ST s (STArray s HashKey (Map Key Value))
>
> I was misunderstanding your intention here.  This is an interesting
> approach, but I would still try an IntMap first.

A simple hash-function for strings is to simply exclusive-or the bytes  
and then reduce modulo a prime number, which will be the table size.  
The average lookup time complexity is the one of the hash table. You  
can then compare with the IntMap, which may have time complexity O(4),  
the number of bytes in an Int32.

There might be a big difference in the constant factor, though: an  
array seems to be much faster. I first made multivariate polynomial  
division using lists, because it is easy to work with, and an example  
of some stuff built on top took tens of seconds to run in Hugs. When  
reworking it using STArray (mutable) and Array (immutable), it has no  
noticeable delay.

Though I am using Map and Set, too, the choice of container seems  
otherwise be dictated by what is most convenient to program.

   Hans