[Haskell-cafe] Very imperfect hash function
Hans Aberg
haberg at math.su.se
Tue Feb 2 03:50:07 EST 2010
On 2 Feb 2010, at 03:05, Richard O'Keefe wrote:
>> A simple hash-function for strings is to simply exclusive-or the
>> bytes and then reduce modulo a prime number,
>
> Simply exclusive-oring the bytes will give you at most 256 distinct
> results. (For an ASCII source, 128 distinct results.) After that,
> there hardly seems to be any point in reduction modulo a prime.
Right - I just gave an example of how simple hash functions may be
created. The original question deals with Int32s, not strings.
As already mentioned before, there are more advance libraries here
http://en.wikipedia.org/wiki/Perfect_hash_function
but they are not written in Haskell. So you have to rewrite them or
use the FFI.
> This approach can't tell a CAT from an ACT or a DOG from a GOD, which
> is another strike against it. (It also can't tell a TITTLE from a
> TILE,
> or a BOTTLE from a BOLE, for obvious reasons.)
The hash function just tries to produce random lookup values, which
are used to flatten the average depth of the lookup table at the
entries of the array. So there is a tradeoff between a fast hash
function, and one that does a good job.
Hans
More information about the Haskell-Cafe
mailing list