RFC: Adding a Hashable type class and HashMap/HashSet data types
johan.tibell at gmail.com
Thu Nov 25 12:16:57 EST 2010
On Thu, Nov 25, 2010 at 5:15 PM, Thomas Schilling
<nominolo at googlemail.com> wrote:
> Regarding hashing of ADTs, I agree with you that hash () == hash False
> should be no problem. The general principles would be:
> 1. prop_hashEq x y = x == y ==> hash x == hash y
> and for a data type with constructors C_1 .. C_n
> 2. hash (C_i x_i1 ... x_iN) = hash i <> hash x_i1 <> ... <> hash x_iN
> where <> is the hash combine function (probably (.))
> Principle (1) may require to use some form of normalisation before
> applying principle (2), e.g., for Set, differently balanced trees
> should still produce the same hash.
> Does this look reasonable?
(1) is a property that most data structures would require. I don't
think it needs to be enforced in the Hashable type class.
> Also, I'm fine with the return of hash being an architecture-specific
> Word type, but we also need variants with a known bit-width of the
> resulting hash. For example, a Word32 hash has way more collisions
> than a Word64 hash, and this is important for some applications.
I agree. I don't know if the specific sized hashes should be in the
Hashable type class though. Typically you only need these for a few
string like types.
More information about the Libraries