[Haskell] Re: Fingerprints and hashing

Thu Oct 11 21:07:25 EDT 2007

On Oct 11, 2007, at 4:33 PM, apfelmus wrote:
...
>
> So, the idea is to use a "local gödel numbering" and uniquely  
> number the and only the trees that are actually constructed (no  
> collisions, but few in numbers). In other words, every new tree  
> created gets the gödel number  size collection + 1  and we can  
> simply use
>
>   type GödelNumber = Word32
>
> assuming that no more than 4G of trees will ever be constructed.  
> For fast hash-consing, the collection itself is a (generalized) trie
>
>   type Collection  = ExpF GödelNumber ~~> Maybe GödelNumber
>
> mapping each new top of a tree (constructor + gödel numbers for  
> already known subtrees) to either Just its gödel number or Nothing  
> when it's not in the collection yet.
>
> With this, CSE becomes a catamorphism that allocates new gödel  
> numbers if necessary
> ...
> CSE in the sense that all common subexpressions now have the same  
> gödel number.
>
> Of course, the drawback compared to "free-form" fingerprinting is  
> that the fingerprinting and the collection now depend on each  
> other. But for CSE, we have to carry the collection around anyway.
>
>
> I don't know any references for that method since I came up with it  
> myself and haven't searched around yet. Any pointers?

Actually, the paper that Lauri cited in his message mentions  
essentially this technique; this is equivalent to the permutation T()  
that they impose when fingerprinting a DAG.  That citation again:
     http://citeseer.ist.psu.edu/broder93some.html

But it's generally pretty well known.

-Jan