RFC: Adding a Hashable type class and HashMap/HashSet data types to HP

Bryan O'Sullivan bos at serpentine.com
Fri Nov 26 14:35:10 EST 2010


On Thu, Nov 25, 2010 at 2:23 PM, Johan Tibell <johan.tibell at gmail.com>wrote:

>
> Is this an interesting problem to optimize for?
>

I think so. The purpose of having the Hashable typeclass is so that the
universe of typed values that we can hash is open. It thus behooves us to
figure out how to do that safely. For instance, someone using the Hashable
class should be able to figure out how to hash this:

data HttpUrl = HttpUrl {
  urlScheme :: ByteString,
  urlHost :: ByteString,
  urlPort :: Int,
  urlPath :: ByteString,
  urlQuery :: Maybe ByteString
  }

Using the Hashable class, I can see how to hash the individual elements of
this, but not how to safely glue them together into a hash for the whole
value.

Over in the Java world, where they depend to an arguably ridiculous degree
on hashing, we often see horror stories of some hash function having bad
dispersal properties (or worse, equality and hashing not matching) and
causing nasty knock-on performance effects due to too many values hashing to
the same few numbers in the range.

I notice that the Hashable class doesn't have a way to supply a seed. If it
did, we could possibly chain the result of one hash as the seed of the next,
allowing us to construct a complete hash in an obvious way (although
introducing a possibly undesirable data dependency, making it difficult or
impossible to hash a large structure in parallel).
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/libraries/attachments/20101126/60e14a71/attachment.html


More information about the Libraries mailing list