Dan Weston westondan at imageworks.com
Thu Oct 11 14:47:57 EDT 2007

``` > One example of such an minusFP (not recommended) is (foldr xor 0):

Obviously I meant that FP = foldr xor 0. minusFP would be a simple
unfolding of this.

Dan Weston wrote:
> I am zero training in cryptography, but I would think that if in
>
>   FP(as ++ bs) = FP(bs) `plusFPFlipped` FP(as)
>
> (I think the flipped plusFP def is more intuitive)
>
> there also existed a minusFP for all f and x such that
>
>   FP(bs) = FP(as ++ bs) `minusFP` FP(as)
>
> then that might facilitate common subexpression refactorization in the
> merging of two hashes of memoized expressions.
>
> One example of such an minusFP (not recommended) is (foldr xor 0):
>
> That xor is its own inverse is no advantage. But xor is associative and
> (xor 0) is an identity, so FP([] ++ as) = FP(as ++ []) = FP(as) as
> expected. There must be more robust such monoidal functors out there.
>
> Refactorization could be limited to respect substructure boundaries
> reflected in the serialization.
>
> Dan Weston
>
> Simon Peyton-Jones wrote:
>> Interesting! The associativity property is the kind of thing I was
>> after.  Except that I don't really care if FP(as ++ bs) = FP(as)
>> `plusFP` FP(bs).  I only care that the latter is robust in the sense
>> of having low probabilty of collision.  So the Rabin scheme does more
>> than I really need (which is probably fine).
>>
>> On page 1 he draws a distinction between fingerprinting and hashing,
>> which relates to my last wish.  Does one need a different technique
>> for the two, or is it enough to reduce the number of bits in the Rabin
>> scheme, and thereby get a hashing scheme?
>>
>> Strangely the paper gives no comparison with other fingerprinting
>> schemes.
>>
>> Simon
>>
>> | -----Original Message-----
>> | Sent: 11 October 2007 11:27
>> | Subject: Re: [Haskell] Fingerprints and hashing
>> |
>> | If compositionality is important, at least Rabin's fingerprints are
>> | worth considering: http://citeseer.ist.psu.edu/broder93some.html
>> |
>> | They have the neat property that the fingerprint of a concatenation of
>> | strings can be cheaply computed from the fingerprints of the
>> | constituents. I think this effectively means that the plusFP operation
>> | can be made associative, which might offer optimization opportunities.
>> |
>> | I remember implementing it some seven years ago when prototyping for a
>> | C implementation. The code is lost, though. I can give it a shot
>> | again, if this is considered a reasonable algorithm.
>> |
>> |
>> | Lauri
>> | _______________________________________________
>> _______________________________________________