[Haskell-cafe] What's this pattern called?

Fri Oct 23 00:27:39 EDT 2009

Martijn van Steenbergen wrote:
> Bonjour café,
> 
>> data ExprF r
>>   =  Add  r  r
>>   |  Sub  r  r
>>   |  Mul  r  r
>>   |  Div  r  r
>>   |  Num  Int
> 
> This is a well-known pattern that for example allows nice notation of 
> morphisms. But what is it called? I've heard fixed-point view, open 
> datatypes and some others, but I'm curious where this pattern comes up 
> in literature and what it is called there.

This is an example of "open recursion", which is when you take some 
recursive function/datatype and rewrite it without recursion by passing 
the function/type in as an argument to itself. It's the datatype 
equivalent of doing:

     fibF _ 0 = 0
     fibF _ 1 = 1
     fibF f n = f(n-1) + f(n-2)

     fib = fix fibF

Which can be useful for functions because we can use a different 
fixed-point operator, e.g. one that adds memoization abilities or other 
features in addition to the recursion.

As others've mentioned, the open-recursive version of a recursive data 
type happens to be a "functor". Or rather, the recursive type happens to 
be isomorphic to the least fixed point of a generating functor[1][2] 
because the functor is also, in the terms of recursion theory, an 
"initial algebra". Part of why this pattern is so nice comes from the 
fact that it's a functor (so we can use fmap to apply a function one ply 
down), but part of it also comes from the isomorphism of using an 
explicit fixed-point operator (which allows us to un-fix the type and do 
things like storing the accumulators of a fold directly in the normal 
constructors, rather than needing to come up with an ad-hoc isomorphic 
set of constructors[3]), and the fact that it's an initial algebra ties 
these two things together nicely.

This is also an example of Tim Sheard's "two-level types", albeit a 
trivial one since the fixed-point operator doesn't add anything other 
than recursion. One of the particular ideas behind Sheard's two-level 
types is that we can split the original recursive type in a different 
place where one of the levels contains some constructors and the other 
level contains other constructors. This can be helpful when you have a 
family (informally speaking) of similar types, as for example with 
implementing unification. All types that can be unified share 
constructors for unification variables; but the constructors for the 
structural components of the type are left up to another level. Thus we 
can reuse the variable processing code for unifying different types, and 
also be modular about the type being unified.

[1] This should be somewhat obvious if you're familiar with the 
inductive phrasing of constructing the set of all values for some type. 
E.g. "Basis: [] is a list. Induction: (:) takes a value and a list into 
a list". So we have some functor and we keep applying it over and over 
to generate the set of all values, building up from the base cases.

[2] Do note that in Haskell the least fixed point and the greatest fixed 
point coincide. Technically, whether the least or greatest fixed point 
is used depends on the construction (e.g. catamorphisms use least, 
anamorphisms use greatest). This is also related to the topic of 
"codata" which is the fixed point of a terminal coalgebra.

[3] Data.List.unfoldr is a prime example of an ad-hoc isomorphic set of 
constructors. Instead of the current type, we could instead use an 
implementation where:

     newtype Fix   f   = Fix { unFix :: f (Fix f) }
     data    ListF a r = Nil | Cons a r
     type    List  a   = Fix (ListF a)

     unfoldr :: (b -> ListF a b) -> b -> List a
     unfoldr = ...

which is a bit more obviously correlated with anamorphisms in recursion 
theory.

-- 
Live well,
~wren