[Haskell-cafe] Detecting Cycles in Datastructures

Fri Nov 18 14:28:24 EST 2005

Greg Woodhouse wrote:
 > --- Paul Hudak <paul.hudak at yale.edu> wrote:
 >>Tom Hawkins wrote:
 >> > In a pure language, is it possible to detect cycles in recursive
 >> > data structures?  For example, is it possible to determine that
 >> > "cyclic" has a loop? ...
 >> >
 >> > data Expr = Constant Int | Addition Expr Expr
 >> >
 >> > cyclic :: Expr
 >> > cyclic = Addition (Constant 1) cyclic
 >> >
 >> > Or phased differently, is it possible to make "Expr" an instance of
 >> > "Eq" such that cyclic == cyclic is smart enough to avoid a
 >> > recursive decent?
 >
 > I'm a little confused. It's one thing to demonstrate that a (possibly
 > infinite) digraph contains a loop, but quite another to show that it
 > does not. Carrying around a copy of the subgraph consisting of nodes
 > actually visited is workable in a pure language, but there's no
 > guarantee of termination.

The question that was originally posted assumed that the graph was
represented using direct recursion in Haskell.  In which case you
cannot detect a cycle, as previous replies pointed out.  Of course, if
you represented the graph in some more concrete form -- such as an
explicit list of nodes and edges -- then you could detect the cycle
using a standard graph algorithm, at the expense of losing the
elegance of the Haskell recursion.  My post was just pointing out that
there is a third possibility, one that retains most of the elegance
and abstractness of Haskell, yet still allows detecting cycles.

 >>Perhaps it's obvious, but one way to do this is to make the cycle
 >>*implicit* via some kind of a fixed-point operator, which is a trick
 >>I used recently in a DSL application.
 >
 > So, you imagine f to be a function that (non-deterministally) follows
 > an edge in a digraph. The idea being that G is (possibly infinite)
 > digraph and F is a subgraph. The "function" f would
 > non-deterministically select a vertext of F, follow an edge in G (again
 > chosen non-deterministically), add the (possibly new) edge to F, and
 > return the resulting graph. Then, you are asking if f has a fixpoint in
 > this broader context?

I don't understand this... and therefore it's probably not what I
imagine!  I'm saying simply that a cyclic graph can be represented as
the fixed point of a function.

 >>For example, you could define:
 >>
 >> > data Expr = Const Int
 >> >           | Add Expr Expr
 >> >           | Loop                      -- not exported
 >> >      deriving (Eq, Show)
 >>
 >>The purpose of Loop is to "mark" the point where a cycle exists.
 >>Instead of working with values of type Expr, you work with values of
 >>type Expr -> Expr, or Fix Expr:
 >>
 >> > type Fix a = a -> a
 >>
 >>For example:
 >>
 >> > fe1,fe2 :: Fix Expr
 >> > fe1 e = Add (Const 1) (Const 1)  -- non-recursive
 >> > fe2 e = Add (Const 1) e          -- recursive
 >>
 >>You can always get the value of an Expr from a Fix Expr whenever you
 >>need it, by computing the fixed point:
 >>
 >> > fix f = x where x = f x
 >
 > If it can be computed. Maybe I'm off-track here, but it seems to me
 > that when you introduce laziness into the equation, you lose any
 > guarantee of there even being a fixpoint, much less one that can be
 > computed.

Actually it's the other way around -- laziness is what makes this work! 
  The least fixpoint of fe2 in a strict language, for example, is bottom.

 >> > e1,e2 :: Expr
 >> > e1 = fix fe1          -- finite
 >> > e2 = fix fe2          -- infinite
 >>
 >>Note that e2 is equivalent to your "cyclic".  But now we can also
 >>test for equality:
 >>
 >> > instance Eq (Fix Expr) where
 >> >   fe1 == fe2  =  fe1 Loop == fe2 Loop
 >>
 >>For example, fe2 == fe2 returns "True", whereas e2 == e2
 >>(i.e. your cyclic == cyclic) diverges.
 >>
 >>Of course, note that, although fe1 == fe2 implies that fix fe1 == fix
 >>fe2, the reverse is not true, since there will always be loops of
 >>varying degrees of unwinding that are semantically equivalent, but
 >>not "syntactically", or structurally, equivalent.  Thus this definition
 >>of equality is only approximate, but still, it's useful.
 >
 > I'm lost. Are you saying bottom is bottom?

I suspect from your other post that you haven't seen the standard
trick of encoding infinite data structures as fixpoints.  Suppose you
have a lambda calculus term for cons, as well as for the numeral 1.
Then the infinite list of ones is just:

Y (\ones. cons 1 ones)

where Y (aka the "paradoxical combinator" or "fixed point combinator")
is defined as:

\f. (\x. f (x x)) (\x. f (x x))

To see this, try unfolding the above term a few iterations using
normal-order reduction.  Note that the term has no recursion in it 
whatsoever.

Now, my earlier point above is that:

Y (\ones. cons 1 ones)

unwinds to the same term as:

Y (\ones. cons 1 (cons 1 ones))

yet the two terms (\ones. cons 1 ones) and
(\ones. cons 1 (cons 1 ones)) are not the same (in fact they're not 
lambda convertible, either).

   -Paul