[Haskell wikibook] Re: Haskell wikibook - chapter "Theseus and the
Zipper"
apfelmus
apfelmus at quantentunnel.de
Sun Jul 8 08:58:44 EDT 2007
Dear Mr. Huet,
> Very nice indeed.
>
> The analogy with Ariadne's thread is a natural one. I use myself the
> analogy with mountaineering, you are hanging from the top of the
> datastructure using the zipper as a return rope.
>
> Compliments on a nice site on functional programming. Keep up the good
> work.
Thank you very much for the encouragement, we are delighted that you
appreciate our efforts :)
The aim of the Haskell wikibook is to write a free and gentle textbook
on Haskell and associated techniques for functional programming. Of
course, to keep up the good work, we are always looking for
contributors. In fact, the book lives through contributions from its
readers: it is built with a wiki, utilizing the same platform as
Wikipedia and hence sharing the same principle that anyone can edit.
> After I learned of Conor Mc Bride's work, I wrote some more about
> zippers for a de Bruijn's Festschrift, see item 79 in my bibliography
> [http://yquem.inria.fr/~huet/bib.html].
>
> My main application of zippers so far has been for the design of the
> Zen toolkit for computational linguistics, see items 78,82,84,85 and
> site http://sanskrit.inria.fr/huet/ZEN/index.html. This toolkit was
> reused in Haskell by the Grammatical Framework linguistic platform of
> Aarne Ranta.
Very interesting. Pondering the sharing functor, I think there is a way
to make practical the Gödel-numbering you mentioned on page 10 of item
78.
The starting point is that I personally have a slight distaste for hash
tables since they are tied to machine numbers and arrays, concepts
alien to algebraic data types and purely functional programming. I
prefer tries, i.e. finite representations of functions
key -> value
From the point of view of computational complexity, they are optimal:
lookup and insertion all take O(|size of key in bits|) time.
Tries can be obtained systematically from the structure of the key type
by the isomorphisms
(key1 + key2) -> value ≅ (key1 -> value) × (key2 -> value)
trie for a sum type ⇒ pair two tries for the components
(key1 × key2) -> value ≅ (key1 -> (key2 -> value))
trie for a product type ⇒ nest the tries for the components
See also
Ralf Hinze. Generalizing generalized tries. Journal of Functional
Programming, 10(4):327-351, July 2000
http://www.informatik.uni-bonn.de/~ralf/publications/GGTries.ps.gz
From now on, the function arrow -> shall denote a finite map from keys
to values, so that every -> type is represented by an algebraic data
type. In particular, it is possible to construct a generalized trie
with a key type ([Char] -> Bool) representing a set of words
(Haskell-like syntax, I don't know ML). Thus, both your hash table with
subtries of the dictionary as keys and a generalized trie with subtries
as keys represent a map
([Char] -> Bool) -> value ≅ (µX. Bool × (Char -> X)) -> value
Of course, replacing the hash table with the generalized trie
corresponds to the one-to-one Gödel numbering of the keys as you
mentioned on page 10 of item 78. In a sense, the subtries as keys are
(a representation of) their own Gödel numbers. But now, Gödel-number
equality is the structural equality of the tries which defeats the hole
purpose of avoiding repeated traversal of the subtries of the
dictionary. Put differently, even a numeric Gödel-numbering cannot
decrease the size of the tries or it would not be able to represent
every possible trie.
The solution is to restrict the Gödel-numbering to the set of tries we
are interested in, namely to all the subtries of the dictionary. The
intention is that the Gödel numbers sort of simply enumerate the
subtries from 1 to n. In other words, the finite map storing the
subtries becomes a structure storing "local Gödel numbers"
type Subtrie = µX. Bool × (Char -> X)
type Gödel = Int
type GödelStore ≅ Subtrie -> Gödel
The key is that the Gödel numbers can now be used to break the
recursion, yielding
type GödelStore = (Bool × (Char -> Gödel)) -> Gödel
Put differently, a subtrie is represented by its topmost branches with
the children being Gödel numbers instead of hole tries. By traversing
the dictionary bottom-up, we have the Gödel numbers of subtries
available before assigning a Gödel number to the parent. The parent
gets a fresh Gödel number when its not in the GödelStore, otherwise we
use the Gödel number from the store and thus share this subtrie. In
fact, your current hashing scheme proceeds essentially the same way: a
hash number is calculated from the already calculated hash numbers of
the children.
Maybe your current hash-table implementation could already benefit from
Gödel numbers. The insight is to separate Gödel numbers representing
hole subtries and hash numbers which may be used to implement the
"shallow" GödelStore as a hash table. The benefit is that in case of a
collision, the expensive test for structural equality is replaced with
a cheap check for Gödel number equality.
All the best,
apfelmus
> Le 30 juin 07 à 10:40, apfelmus at quantentunnel.de a écrit :
>> Dear Mr. Huet,
>>
>> Fascinated by the zipper since reading your publication, the authors
>> of the Haskell wikibook have written a special introduction to this
>> data structure:
>>
>> "Theseus and the Zipper".
>> http://en.wikibooks.org/wiki/Haskell/Zippers
>>
>> starring Theseus and Ariadne. We hope you will enjoy it :)
>>
>> All the best,
>> apfelmus
More information about the Wikibook
mailing list