RFC: Adding a Hashable type class and HashMap/HashSet data types to HP

Gregory Collins greg at gregorycollins.net
Thu Dec 2 11:09:09 CET 2010


Hashes are almost always word-sized these days. If you want to use
them as an index into a fixed-size table, you mod or otherwise
truncate the hash.

If instead you want to use the hash value to store into a patricia
trie like IntMap, you would like to use the entire word to avoid
collisions. The reason the CPU likes word-sized hash codes is that you
can compare them using register instructions. On 64-bit machines we
should try to use Word64!

G

On Thu, Dec 2, 2010 at 9:40 AM, Malcolm Wallace <malcolm.wallace at me.com> wrote:
> I am probably showing my ignorance here, but I am puzzled as to how a hash
> value of type Word32 helps anyone.  When I learnt CS a long time ago (and I
> even implemented a Hashable class for Haskell in the early 1990s), a hash
> value was always used as the index into a constant-time lookup table.  I
> certainly don't want tables of size 2^32 in any programs I write, even if I
> do have 12G of memory in my machine.
>
> Way back when, my Hashable method needed to know what size of table you
> wanted in order to generate the hash.  Of course, you could resize tables
> upwards if required too.
>
> So I'm imagining you have a different idea in mind for how to store a
> HashMap?  Will it be a rebalancing tree structure, using the ordering on
> Hashes as keys?  This changes the computational complexity of both storage
> and lookup, and I begin to wonder whether you will gain any performance
> ultimately.
>
> Regards,
>    Malcolm
>
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries
>



-- 
Gregory Collins <greg at gregorycollins.net>



More information about the Libraries mailing list