RFC: Adding a Hashable type class and HashMap/HashSet data types to HP

Yitzchak Gale gale at sefer.org
Sat Nov 20 14:26:45 EST 2010


Johan Tibell wrote:
> * Instances of Hashable for common types, in particular ByteString and Text.

And of course, all of the numeric types, unit, and Bool, Maybe, and Either.

More interesting are Maybe, Either, lists, tuples, Data.Map, Data.Set,
etc. These are far from trivial; I did some work on it for Python. If you
think you can just do something fairly simple and reasonable and hope
that it will have the right statistical properties, I promise you that it won't
work.

Unless someone knows of a better methodology, you need to
develop a battery of QuickCheck-like tests to build complexity into
data structures in various ways, then use trial and error to come
up with an algorithm that gives results within some tolerance for all
of the tests while still being very fast.

> I already have fast instances for Text and ByteString, based on MurmurHash2 [2].

Ah, I see from that site that they used pretty much the same
kinds of techniques. They actually used chi-square, I didn't get that
fancy.

Theoretically, if the hashes for Maybe and Either are
good enough, all the other container hashes could be
built out of those. But I doubt that would be very easy.
Anyway, the main work is building all the tests, which
you would have to do in any case.

Regards,
Yitz


More information about the Libraries mailing list