[Haskell-cafe] Functors and the Visitor Pattern

Thu Jun 4 22:45:17 EDT 2009

Johan Tibell wrote:
> wren ng thornton wrote:
> > [2] For the recursive Visitor pattern I use most often, that is. For the
> > non-recursive version it's usually fmap. This is the part where the pattern
> > gets a bit shaky because there are actually many different patterns all
> > called "Visitor". The main points of interest are whether it's recursive or
> > not, and whether it applies the visitor to itself, to its children, or both.
> >
> > non-recursive + itself == ($)
> >
> > non-recursive + children == fmap (under open-recursion interpretation of
> > the type, aka all nodes are elements)
> >
> > recursive + children == fmap (under closed-recursion interpretation, aka
> > only fringe nodes are elements)
> >
> > recursive + both == cata (usually, though it depends how you aggregate)
> >
> > recursive + itself == This is actually a variant of the Iterator pattern
> 
> 
> Could you be so kind to give an example for each?

In OOP you mean?

/* nonrecursive + self == application */
class A { T app(Visitor v) { return v.visit(this); } }
class B { T app(Visitor v) { return v.visit(this); } }
...
// An allomorphic function :: (A | B | ...) -> T
class Visitor {
     T visit(A a) { ... }
     T visit(B b) { ... }
     ...
}

This particular version often isn't too helpful because it's just 
reflecting the method call, we could've just called visit directly 
instead of calling app. But there are times where it is useful, 
particularly when you want to have some visitors which are recursive and 
some which are not. In which case it doesn't matter which method you 
start with, but you do need both in order to reflect back on recursion.

/* nonrecursive + children == fmap (with real parametricity) */
class F<A> {
     Children<A> as;
     F(Children<A> as) { this.as = as; }

     F<B> fmap(Visitor<A,B> v) {
         Children<B> bs = new Children<B>();
         for (A a : this.as)
             bs.add( v.visit(a) );
         return new F<B>(bs);
     }
}
...
interface Visitor<A,B> { B visit(A a); }

This is a rather Haskellish take on this version. In practice people 
often don't bother supporting parametricity (needed for making F a real 
functor). That is, usually they'll do destructive updates to F, only 
have endofunction visitors (so there's no change of types), or use 
side-effect only visitors (see below).

/* recursive + children == fmap (side-effect only) */
abstract class Tree { abstract void rmap(Visitor v); }
class Branch extends Tree {
     Children<Tree> subtrees;

     void rmap(Visitor v) {
         for (Tree t : this.subtrees)
             v.visit(t);
     }
}
class Leaf extends Tree {
     void rmap(Visitor v) {
         // Just in case we're the root node.
         v.visit(this);
         // Or we could do nothing instead,
         // depending on desired semantics
     }
}
class Visitor {
     void visit(Branch t) { t.rmap(this); } // reflect to recurse
     void visit(Leaf t) { ... } // don't reflect or you'll hit _|_
}

This highlights an additional axis of variation in the many different 
visitor patterns, whether the "result" is returned directly (as in the 
previous example), whether it is accumulated in the Visitor itself 
(requiring explicit lookup later), or whether it's done via side-effects 
on global state. The accumulator and side-effect versions are a bit more 
general since their "return type" isn't restricted by the classes being 
visited.

/* recursive + self/both == a kind of Iterator/catamorphism */
abstract class Tree { abstract void observe(Visitor v); }
class Branch extends Tree {
     Children<Tree> subtrees;

     void observe(Visitor v) {
         v.visit(this);

         for (Tree t : this.subtrees)
             t.observe(v);
     }
}
class Leaf extends Tree {
     void observe(Visitor v) {
         v.visit(this);
     }
}
class Visitor {
     void visit(Branch t) { ... }
     void visit(Leaf t) { ... }
}

This is different from the recursive+children version because this 
version keeps all of the recursion code on the side of the visited 
classes, and it also meaningfully visits interior nodes. In the 
recursive+children version the visitor ignored branches (though it 
doesn't need to) and reflected back to initiate recursion, whereas this 
version will recurse no matter what the visitor does (barring 
exceptions, etc).

This version is a push iterator which forces you to visit all nodes, 
rather than the more usual pull iterator where you need to call next() 
to get the next node. We can convert between the two varieties by using 
co-routines or threads or other control-flow tricks.

If we reverse the order of the recursive observe and the visit(this) 
then we get something like a catamorphism. Whether it's actually a 
catamorphism depends on what the visitor does, or rather what knowledge 
about the shape of Branch and Leaf it makes use of. "Real" catamorphisms 
are fairly rare in OOP, though you often find things like using a 
visitor to add decoration to a tree (which is much like passing the 
initial algebra to cata) or maintaining some aggregation in the visitor.

-- 
Live well,
~wren