[Haskell-cafe] Chuch encoding of data structures in Haskell
Günther Schmidt
gue.schmidt at web.de
Thu May 27 13:44:16 EDT 2010
Hello C,
thank you for explaining.
The funny thing is that I have never seen anybody take this even a
single step further than you have in your email.
In particular I have not found anything where someone might use church
encoding to solve a quite practical problem, namely for implementing
extensible records.
For instance suppose we'd have
data Person = Person { firstName :: String, lastName :: String }
person1 :: Person
person1 = Person "John" "Doe"
alternatively, and I'm not 100% if this what is commonly understood as
"church encoding" I'm just guessing, we could write this:
person1' :: (String -> String -> x) -> x
person1' = \fn -> fn "John" "Doe"
Now from a practical standpoint, up to here anyway, the first encoding
is far more useful.
But there is a problem when we want to extend the Person type by a new
field, age for instance. Person as such is *not* extensible, modifying
it would require us to rewrite existing code.
But person1' is easily extended:
person1'' :: (String -> String -> Int -> x) -> x
person1'' = \fn -> person1' fn 36
or more generally:
(.+.) :: (t1 -> t -> t2) -> t -> t1 -> t2
rec .+. v = \fn -> rec fn v
person1''' :: (String -> String -> Int -> x) -> x
person1''' = person1'' .+. 36
There is also a way to shrink the "record" from the left:
drp :: ((t1 -> t) -> b) -> t -> b
drp rec = \fn -> rec $ \_ -> fn
I can think of a number of applications for this and especially I
believe it should be possible to create more "combinators".
The approach is so simple and trivial that it must have occurred to
people a hundred times over. Yet I do not find any other examples of
this. Whenever I google for church encoding the examples don't go beyond
church numerals.
Am I googling for the wrong keywords?
Best regards
Günther
Am 27.05.10 19:10, schrieb C. McCann:
> 2010/5/27 Günther Schmidt<gue.schmidt at web.de>:
>> I'm exploring the use of church encodings of algebraic data types in
>> Haskell.
>> Since it's hard to imagine being the first to do so I wonder if folks here
>> could point me to some references on the subject.
>>
>> I'm looking for examples of church encodings in Haskell a little bit beyond
>> Church Booleans and Church Numerals.
>
> The fully general description of Church encoding is rather simple, but
> I've rarely seen it described explicitly. Consider the type of Church
> encodings for Bool, Either, and the 2-tuple (written here as "Pair"
> for clarity):
>
> churchedBool :: t -> t -> t
> churchedEither :: (a -> t) -> (b -> t) -> t
> churchedPair :: (a -> b -> t) -> t
>
> And compare the signatures for the constructors of the non-Church encoded types:
>
> True :: Bool
> False :: Bool
> Left :: a -> Either a b
> Right :: b -> Either a b
> Pair :: a -> b -> Pair a b
>
> We can observe two patterns: 1) The Church encodings take as many
> arguments as the type has constructors 2) The type of each argument is
> the same as the signature of a constructor, except returning an
> arbitrary type. As this suggests, "Church decoding" is as simple as
> applying the Church encoded type to each of the constructors.
>
> From the above, a general description of Church encoding can be
> deduced: The encoding of a value is a function that replaces each data
> constructor with an arbitrary function. The Church encoding
> represents, in a way, the most generalized means of using values of
> that type--which is why Haskell already includes variations of "Church
> encode" functions for a few standard types, like so:
>
> encodeEither x = \f g -> either f g x
> encodeMaybe x = \z f -> maybe z f x
> encodeTuple x = \f -> uncurry f x
> encodeBool x = \t e -> if x then t else e
>
> But what of Church numerals? First, we must consider the
> Church-encoding of recursive data types. Given arbitrary nested data
> types, there's nothing else that can be done--the outer types know
> nothing of the types they contain. But if an inner type is known to be
> the same as the outer type, there are two options for the encoding:
> Work only with the outermost value, as with non-recursive types, or
> work with the recursive value as a whole, by having the outermost
> value pass its arguments inward.
>
> Now, consider the signature of a Church numeral:
>
> churchedNumeral :: (t -> t) -> t -> t
>
> Given the above, what can we say about the equivalent "decoded" data
> type? It takes two arguments, so we have two constructors. The second
> argument is a single value, so the associated constructor is nullary.
> The first argument must be associated with a unary constructor, but
> look at the parameter it takes: the same type as the result! This is
> how we can tell that Church numerals are the encoding of a recursive
> type. Since the only way a recursive constructor can do anything with
> the values it contains is to pass its arguments inward, the value it
> has to work with is the result of doing so. Thus, the other
> constructor must take a single argument of its own type. What does
> this look like as a standard data type?
>
> data Nat = S Nat | Z
>
> Good old inefficient Peano numbers, of course!
>
> Keeping all that in mind, consider a List type, and both ways of encoding it:
>
> data List a = Nil | Cons a (List a)
>
> churchedListOuter :: t -> (a -> ___ -> t) -> t
> churchedListRecursive :: t -> (a -> t -> t) -> t
>
> For the "outermost" method, what type belongs in place of the ___? The
> second argument of Cons is itself a List, so naively we would like to
> simply put the type of "churchedListOuter" itself in place of ___, but
> that won't work. In fact, nothing will work here, because recursion on
> an "outermost encoded" list is impossible in a typed λ-calculus
> without some means of using general recursion provided as a primitive
> operation. Nested tuples can be used, but the length of the list will
> be reflected in the type, preventing polymorphism over arbitrary
> lists. This is also why Church encoding is most often seen in the
> setting of the untyped λ-calculus.
>
> The implicitly recursive encoding, however, presents no such problems.
> So, perhaps a function to Church encode lists would be useful? Indeed
> it would, and as before, it already exists:
>
> encodeList x = \z f -> foldr f z x
>
> Recall the earlier observation that "decoding" involves applying the
> encoded type to its equivalent constructors; the same holds true for
> recursive types, as demonstrated by right-folding a list with (:) and
> [], or applying a Church numeral to 0 and (+ 1).
>
> Regarding recursive data types as the least fixed point of a
> non-recursive data type is thus tied to replacing the "outermost"
> encoding with the "recursive" encoding, and the "Church encode"
> function for a recursive type is simply a generalized right fold,
> partially applied.
>
> Now, the descriptions above are rather informal, and ignore the
> possibility of infinite lazy recursive data structures, among other
> gaps; but perhaps will help to get you started regardless.
>
> - C.
More information about the Haskell-Cafe
mailing list