[Haskell] Fingerprints and hashing
westondan at imageworks.com
Thu Oct 11 14:45:35 EDT 2007
I am zero training in cryptography, but I would think that if in
FP(as ++ bs) = FP(bs) `plusFPFlipped` FP(as)
(I think the flipped plusFP def is more intuitive)
there also existed a minusFP for all f and x such that
FP(bs) = FP(as ++ bs) `minusFP` FP(as)
then that might facilitate common subexpression refactorization in the
merging of two hashes of memoized expressions.
One example of such an minusFP (not recommended) is (foldr xor 0):
That xor is its own inverse is no advantage. But xor is associative and
(xor 0) is an identity, so FP( ++ as) = FP(as ++ ) = FP(as) as
expected. There must be more robust such monoidal functors out there.
Refactorization could be limited to respect substructure boundaries
reflected in the serialization.
Simon Peyton-Jones wrote:
> Interesting! The associativity property is the kind of thing I was after. Except that I don't really care if FP(as ++ bs) = FP(as) `plusFP` FP(bs). I only care that the latter is robust in the sense of having low probabilty of collision. So the Rabin scheme does more than I really need (which is probably fine).
> On page 1 he draws a distinction between fingerprinting and hashing, which relates to my last wish. Does one need a different technique for the two, or is it enough to reduce the number of bits in the Rabin scheme, and thereby get a hashing scheme?
> Strangely the paper gives no comparison with other fingerprinting schemes.
> | -----Original Message-----
> | From: haskell-bounces at haskell.org [mailto:haskell-bounces at haskell.org] On Behalf Of Lauri Alanko
> | Sent: 11 October 2007 11:27
> | To: haskell
> | Subject: Re: [Haskell] Fingerprints and hashing
> | If compositionality is important, at least Rabin's fingerprints are
> | worth considering: http://citeseer.ist.psu.edu/broder93some.html
> | They have the neat property that the fingerprint of a concatenation of
> | strings can be cheaply computed from the fingerprints of the
> | constituents. I think this effectively means that the plusFP operation
> | can be made associative, which might offer optimization opportunities.
> | I remember implementing it some seven years ago when prototyping for a
> | C implementation. The code is lost, though. I can give it a shot
> | again, if this is considered a reasonable algorithm.
> | Lauri
> | _______________________________________________
> | Haskell mailing list
> | Haskell at haskell.org
> | http://www.haskell.org/mailman/listinfo/haskell
> Haskell mailing list
> Haskell at haskell.org
More information about the Haskell