[Haskell-cafe] Detecting Cycles in Datastructures

Fri Nov 18 11:37:40 EST 2005

This is a very late response to an old thread...

Tom Hawkins wrote:
 > In a pure language, is it possible to detect cycles in recursive
 > data structures?  For example, is it possible to determine that
 > "cyclic" has a loop? ...
 >
 > data Expr = Constant Int | Addition Expr Expr
 >
 > cyclic :: Expr
 > cyclic = Addition (Constant 1) cyclic
 >
 > Or phased differently, is it possible to make "Expr" an instance of
 > "Eq" such that cyclic == cyclic is smart enough to avoid a recursive
 > decent?
 >
 > -Tom

---

Perhaps it's obvious, but one way to do this is to make the cycle
*implicit* via some kind of a fixed-point operator, which is a trick I
used recently in a DSL application.  For example, you could define:

 > data Expr = Const Int
 >           | Add Expr Expr
 >           | Loop                      -- not exported
 >      deriving (Eq, Show)

The purpose of Loop is to "mark" the point where a cycle exists.
Instead of working with values of type Expr, you work with values of
type Expr -> Expr, or Fix Expr:

 > type Fix a = a -> a

For example:

 > fe1,fe2 :: Fix Expr
 > fe1 e = Add (Const 1) (Const 1)  -- non-recursive
 > fe2 e = Add (Const 1) e          -- recursive

You can always get the value of an Expr from a Fix Expr whenever you
need it, by computing the fixed point:

 > fix f = x where x = f x

 > e1,e2 :: Expr
 > e1 = fix fe1          -- finite
 > e2 = fix fe2          -- infinite

Note that e2 is equivalent to your "cyclic".  But now we can also test
for equality:

 > instance Eq (Fix Expr) where
 >   fe1 == fe2  =  fe1 Loop == fe2 Loop

For example, fe2 == fe2 returns "True", whereas e2 == e2
(i.e. your cyclic == cyclic) diverges.

Of course, note that, although fe1 == fe2 implies that fix fe1 == fix
fe2, the reverse is not true, since there will always be loops of
varying degrees of unwinding that are semantically equivalent, but not
"syntactically", or structurally, equivalent.  Thus this definition of
equality is only approximate, but still, it's useful.

If you want to have multiple loops, or a loop not at the top level,
then you need to add some kind of a loop constructor to Expr.  I've
sketched that idea below, and pointed out a couple of other useful
ideas, such as showing a loop, and mapping a function over a loop
without unwinding it.

I hope this helps,

   -Paul

----------------------------------------------------------------------

 > module Cyclic where

Loop now needs an ID because there may be more than one of them.

 > data Expr = Const Int
 >           | Add Expr Expr
 >           | Rec (Fix Expr)            -- implicit loop
 >           | Loop ID                   -- not exported
 >
 > type Fix a = a -> a
 > type ID    = Int

To check equality of and show Exprs, we need to supply unique IDs to
each recursive loop, which we do via a simple counter.

 > instance Eq Expr where
 >   e1 == e2  =
 >     let eq (Const x) (Const y)       n  =  x == y
 >         eq (Loop i1) (Loop i2)       n  =  i1 == i2
 >         eq (Add e1 e2) (Add e1' e2') n  =  eq e1 e1' n && eq e2 e2' n
 >         eq (Rec fe1) (Rec fe2)       n  =  eq (fe1 (Loop n))
 >                                               (fe2 (Loop n)) (n+1)
 >         eq _ _                       n  =  False
 >     in  eq e1 e2 0
 >
 > instance Show Expr where
 >   show e  =
 >     let show' (Const x)   n =  "(Const " ++ show x ++ ")"
 >         show' (Add e1 e2) n =  "(Add " ++ show' e1 n ++ " "
 >                                        ++ show' e2 n ++ ")"
 >         show' (Loop i)    n =  "(Loop " ++ show i ++ ")"
 >         show' (Rec fe)    n =  "(Rec " ++ show n ++ " "
 >                                        ++ show' (fe (Loop n)) (n+1)
 >                                        ++ ")"
 >     in  show' e 0
 >

Unwinding (i.e. computing fixpoints):

Note: unwind should never see a Loop constructor.

 > unwind :: Expr -> Expr
 > unwind (Add e1 e2) = Add (unwind e1) (unwind e2)
 > unwind (Rec fe)    = x where x = unwind (fe x)
 > unwind e           = e

The 2nd equation above is analogous to:
fix f = x where x = f x
And these two equations could also be written as:
fix f = f (fix f)
unwind (Rec fe) = unwind (fe (Rec fe))

Examples:

 > fe1,fe2 :: Fix Expr
 > fe1 e = Add (Const 1) (Const 1)  -- non-recursive
 > fe2 e = Add (Const 1) e          -- recursive
 >
 > e1,e2,e3 :: Expr
 > e1 = Rec fe1                                 -- no real loop
 > e2 = Rec fe2                                 -- top-level loop
 > e3 = Add e2 (Const 0)                        -- lower-level loop
 > e4 = Rec (\e1-> Add (Const 1)
 >                     (Rec (\e2-> Add e1 e2))) -- nested loop
 >
 > b1,b2 :: Bool
 > b1 = e3==e3           -- returns True
 > b2 = e3==e2           -- returns False
 >
 > e1u,e2u,e3u :: Expr
 > e1u = unwind e1       -- finite
 > e2u = unwind e2       -- infinite
 > e3u = unwind e3       -- infinite
 > e4u = unwind e4       -- infinite

Now suppose we define a function, say mapE, that applies a function to
the leaves (in this case the Const Int's) of an Expr.  For example:

mapE succ (Add (Const 1) (Const 2)) => Add (Const 2) (Const 3)

Then if we define something like:

cyclic1 = Add (Const 1) cyclic1

and do "mapE succ cyclic", we'd like to get:

cyclic2 = Add (Const 2) cyclic2

However, most implementations will unwind the loop -- i.e. we don't
get the sharing implied by the explicit loop.  However, by using Rec
to express loops implicitly, we can get around this as follows:

 > mapE :: (Int->Int) -> Expr -> Expr
 > mapE f e = mapE' f e 0 where
 >   mapE' f (Const i)   n = Const (f i)
 >   mapE' f (Add e1 e2) n = Add (mapE' f e1 n) (mapE' f e2 n)
 >   mapE' f (Rec fe)    n = Rec (absLoop n (mapE' f (fe (Loop n)) (n+1)))
 >   mapE' f (Loop i)    n = Loop i

absLoop n e turns e :: Expr into a Fix Expr by "abstracting out" Loop n:

 > absLoop :: Int -> Expr -> Fix Expr
 > absLoop n e = \e' ->
 >    let abs (Loop n') | n==n' = e'
 >        abs (Add e1 e2)       = Add (abs e1) (abs e2)
 >        abs e                 = e
 >    in abs e

Examples:

 > e5 = mapE succ e2
 > e6 = Rec (\e -> Add (Const 2) e)
 > b3 = e5==e6   -- returns True!
 >
 > e7 = mapE succ e4
 > e8 = Rec (\e1-> Add (Const 2)
 >                     (Rec (\e2-> Add e1 e2)))
 > b4 = e7==e8   -- returns True!