[Haskell-cafe] Church vs Boehm-Berarducci encoding of Lists

Thu Sep 20 10:15:38 CEST 2012

Dan Doel wrote:
> >> P.S. It is actually possible to write zip function using Boehm-Berarducci
> >> encoding:
> >>         http://okmij.org/ftp/Algorithms.html#zip-folds
>
> If you do, you might want to consider not using the above method, as I
> seem to recall it doing an undesirable amount of extra work (repeated
> O(n) tail). 
It is correct. The Boehm-Berarducci web page discusses at some extent
the general inefficiency of the encoding, the need for repeated
reflections and reifications for some (but not all) operations. That
is why arithmetic on Church numerals is generally a bad idea.

A much better encoding of numerals is what I call P-numerals
	http://okmij.org/ftp/Computation/lambda-calc.html#p-numerals
It turns out, I have re-discovered them after Michel Parigot (so my
name P-numerals is actually meaningful). Not only they are faster; one
can _syntactically_ prove that PRED . SUCC is the identity.

The general idea of course is Goedel's recursor R.

   R b a 0 = a
   R b a (Succ n) = b n (R b a n)

which easily generalizes to lists. The enclosed code shows the list
encoding that has constant-time cons, head, tail and trivially
expressible fold and zip.

Kim-Ee Yeoh wrote:
> So properly speaking, tail and pred for Church-encoded lists and nats
> are trial-and-error affairs. But the point is they need not be if we
> use B-B encoding, which looks _exactly_ the same, except one gets a
> citation link to a systematic procedure.
>
> So it looks like you're trying to set the record straight on who actually
> did what.

Exactly. Incidentally, there is more than one way to build a
predecessor of Church numerals. Kleene's solution is not the only
one. Many years ago I was thinking on this problem and designed a
different predecessor:

excerpted from http://okmij.org/ftp/Haskell/LC_neg.lhs

    One ad hoc way of defining a predecessor of a positive numeral
		    predp cn+1 ==> cn
    is to represent "predp cn" as "cn f v"
    where f and v are so chosen that (f z) acts as
	    if z == v then c0 else (succ z)
    We know that z can be either a numeral cn or a special value v. All
    Church numerals have a property that (cn combI) is combI: the identity
    combinator is a fixpoint of every numeral. Therefore, ((cn combI) (succ
    cn)) reduces to (succ cn). We only need to choose the value v in such
    a way that ((v I) (succ v)) yields c0.

    > predp = eval $
    >   c ^ c 
    >        # (z ^ (z # combI # (succ # z)))       -- function f(z)
    >        # (a ^ x ^ c0)                         -- value v

{-# LANGUAGE Rank2Types #-}

-- List represented with R

newtype R x = R{unR :: forall w.
  -- b
  (x -> R x -> w -> w)
  -- a
  -> w
  -- result
  -> w}

nil :: R x
nil = R (\b a -> a)

-- constant type
cons :: x -> R x -> R x
cons x r = R(\b a -> b x r (unR r b a))

-- constant time
rhead :: R x -> x
rhead (R fr) = fr (\x _ _ -> x) (error "head of the empty list")

-- constant time
rtail :: R x -> R x
rtail (R fr) = fr (\_ r _ -> r) (error "tail of the empty list")

-- fold
rfold :: (x -> w -> w) -> w -> R x -> w
rfold f z (R fr) = fr (\x _ w -> f x w) z

-- zip is expressed via fold
rzipWith :: (x -> y -> z) -> R x -> R y -> R z
rzipWith f r1 r2 =  rfold f' z r1 r2
 where f' x tD = \r2 -> cons (f x (rhead r2)) (tD (rtail r2))
       z       = \_  -> nil

-- tests

toR :: [a] -> R a
toR = foldr cons nil

toL :: R a -> [a]
toL = rfold (:) []

l1 = toR [1..10]
l2 = toR "abcde"

t1 = toL $ rtail l2
-- "bcde"

t2 = toL $ rzipWith (,) l2 l1
-- [('a',1),('b',2),('c',3),('d',4),('e',5)]