Proposal: Add the unordered-containers package and the hashable package to the Haskell Platform

Isaac Dupree ml at isaac.cedarswampstudios.org
Thu Mar 21 22:26:26 CET 2013


Regarding hashWithSalt determinism:

hashable 1.1:
"The general contract of hash is: * This integer need not remain 
consistent from one execution of an application to another execution of 
the same application. [...] The contract for hashWithSalt is the same as 
for hash, with the additional requirement that any instance that defines 
hashWithSalt must make use of the salt in its implementation." [1]

hashable 1.2:
"The general contract of hashWithSalt is: * If a value is hashed using 
the same salt during distinct runs of an application, the result must 
remain the same. (This is necessary to make it possible to store hashes 
on persistent media.) [...]"

Which contract do we want?

The size of Int varies across platforms, so it is difficult to make an 
instance of these Hashable classes that returns the same hash on e.g. 
x86 and x86_64. [3]

(I imagine it can be done by making sure the high bits of the returned 
Int are always 0 or sign-extended... depending on how 
unordered-containers interprets hashes [if unordered-containers is the 
relevant hash consumer].  This seems fragile.)

Thankfully, if Data.Map is only twice as slow as Data.HashMap, I 
wouldn't feel too bad about using Data.Map for its determinism and good 
worst-case asymptotics.

The type of hashes could be Word32 or Word64, or if it's better for 
performance+security and we don't care about determinism, Word.

Thoughts?

-Isaac


[1] 
http://hackage.haskell.org/packages/archive/hashable/1.1.2.5/doc/html/Data-Hashable.html
[2] 
http://hackage.haskell.org/packages/archive/hashable/1.2.0.4/doc/html/Data-Hashable.html

[3] C++11's std::hash interface has the same issue, except that at least 
it returns an *unsigned* inconsistent-width type, size_t.  I know this 
because I'm making a cross-platform-deterministic C++11 program... I 
think that if I use hash-tables there, I'll probably have to make a 
local copy of a hash-table implementation.  Or audit my code to ensure 
that it never enumerates a hash-table in a way that is affected by 
ordering.  If it were Haskell I could ensure 
enumeration-order-invariance using a class CommutativeMonoid or 
CommutativeSemigroup... ;-)



More information about the Libraries mailing list