RFC: Adding a Hashable type class and HashMap/HashSet data types to HP

Johan Tibell johan.tibell at gmail.com
Thu Nov 25 12:16:57 EST 2010


On Thu, Nov 25, 2010 at 5:15 PM, Thomas Schilling
<nominolo at googlemail.com> wrote:
> Regarding hashing of ADTs, I agree with you that hash () == hash False
> should be no problem.  The general principles would be:
>
>  1.  prop_hashEq x y = x == y ==> hash x == hash y
>
> and for a data type with constructors C_1 .. C_n
>
>   2. hash (C_i x_i1 ... x_iN) = hash i <> hash x_i1 <> ... <> hash x_iN
>
> where <> is the hash combine function (probably (.))
>
> Principle (1) may require to use some form of normalisation before
> applying principle (2), e.g., for Set, differently balanced trees
> should still produce the same hash.
>
> Does this look reasonable?

(1) is a property that most data structures would require. I don't
think it needs to be enforced in the Hashable type class.

> Also, I'm fine with the return of hash being an architecture-specific
> Word type, but we also need variants with a known bit-width of the
> resulting hash.  For example, a Word32 hash has way more collisions
> than a Word64 hash, and this is important for some applications.

I agree. I don't know if the specific sized hashes should be in the
Hashable type class though. Typically you only need these for a few
string like types.

Johan


More information about the Libraries mailing list