# [Haskell-cafe] Detecting Cycles in Datastructures

Andrew Pimlott andrew at pimlott.net
Sun Nov 27 05:07:14 EST 2005

```On Fri, Nov 18, 2005 at 11:37:40AM -0500, Paul Hudak wrote:
> This is a very late response to an old thread...

ditto :-)

> > unwind :: Expr -> Expr
> > unwind (Add e1 e2) = Add (unwind e1) (unwind e2)
> > unwind (Rec fe)    = x where x = unwind (fe x)
> > unwind e           = e

Since this discussion started around observing sharing in the
implementation, I wanted to see whether, by the time we convert your
cunning representation back into an infinite data structure, we have the
sharing we hoped to observe in the first place.

> > fe2 e = Add (Const 1) e          -- recursive
> > e2 = Rec fe2                                 -- top-level loop
> > e2u = unwind e2       -- infinite

main = walk e4u where
walk (Add e1 e2) = walk e2

blows up, showing that we do not.  The problem with unwind seems to be
that the computation of the fixed point keeps calling unwind, which
keeps reconstructing the fixed point:

e2u = unwind e2
= x where x = unwind ((\e -> Add (Const 1) e) x)
= Add (Const 1) (unwind x)
= ...

I then tried

unwind (Rec fe)    = unwind (fix fe)
fix f = x where x = f x

even though I didn't think it would work: (fix fe) would create a
properly sharing structure, but unwind would unshare it:

e2u = unwind (x where x = ((\e -> Add (Const 1) e) x))
(          = Add (Const 1) x)
= Add (Const 1) (unwind (x where x = Add (Const 1) x))
= ...

This does blow up in ghci, but not in ghc (6.4.1), even without
optimization.  I'm not quite sure why, but anyway I want a version that
exhibits sharing even in any reasonable implementation.

Your message gives the technique (in mapE); we only have to apply it to
unwind.  But there is a problem--your code has a bug!

> > mapE :: (Int->Int) -> Expr -> Expr
> > mapE f e = mapE' f e 0 where
> >   mapE' f (Const i)   n = Const (f i)
> >   mapE' f (Add e1 e2) n = Add (mapE' f e1 n) (mapE' f e2 n)
> >   mapE' f (Rec fe)    n = Rec (absLoop n (mapE' f (fe (Loop n)) (n+1)))
> >   mapE' f (Loop i)    n = Loop i
>
> > absLoop :: Int -> Expr -> Fix Expr
> > absLoop n e = \e' ->
> >    let abs (Loop n') | n==n' = e'
> >        abs (Add e1 e2)       = Add (abs e1) (abs e2)
> >        abs e                 = e
> >    in abs e
>
> > e4 = Rec (\e1-> Add (Const 1)
> >                     (Rec (\e2-> Add e1 e2))) -- nested loop
> > e7 = mapE succ e4
> > e8 = Rec (\e1-> Add (Const 2)
> >                     (Rec (\e2-> Add e1 e2)))
> > b4 = e7==e8   -- returns True!

Notice that absLoop does not look inside Rec.  But there could be a Loop
(with the right n) there!  e7 is actually

(Rec (\e2-> Add (Loop 0) e2)))

We might also cast a suspicious eye at (==), which spuriously returned
True!

Really, we want absLoop to eliminate all the Loops it can find.  But it
can't, because it only knows the replacement expression for one Loop.
It would be simpler for the Loop just to contain the expression.  To
enable that, I added a constructor Stop that is like Loop, except it
takes an Expr instead of an Int.  I use this constructor for my sharing
unwind as well; Loop is only needed for (==).  (It would probably be
even better to add an annotation type argument to Expr; this could
enable stricter typechecking that would have caught the bug.)

Complete code:

> {-# OPTIONS -fglasgow-exts #-}
>
> data Expr = Const Int
>           | Rec (Fix Expr)            -- implicit loop
>           | Loop ID                   -- not exported
>           | Stop Expr
>
> type Fix a = a -> a
> type ID    = Int
>
> instance Eq Expr where
>   e1 == e2  =
>     let eq (Const x) (Const y)       n  =  x == y
>         eq (Loop i1) (Loop i2)       n  =  i1 == i2
>         eq (Add e1 e2) (Add e1' e2') n  =  eq e1 e1' n && eq e2 e2' n
>         eq (Rec fe1) (Rec fe2)       n  =  eq (fe1 (Loop n))
>                                               (fe2 (Loop n)) (n+1)
>         eq _ _                       n  =  False
>     in  eq e1 e2 0
>
> unwind :: Expr -> Expr
> unwind e = absStop (unwind' e) where
>   unwind' (Rec fe)    = Stop e where e = absStop (unwind' (fe (Stop e)))
>   unwind' e           = e
>
> mapE :: (Int->Int) -> Expr -> Expr
> mapE f e = mapE' e where
>   mapE' (Const i)   = Const (f i)
>   mapE' (Rec fe)    = Rec (\e -> absStop (mapE' (fe (Stop e))))
>   mapE' e@(Stop _)  = e

Replacement for absLoop that removes all Stops, unlike absLoop which
only removed the Loops that its caller owned.

> absStop (Stop e)      = e
> absStop e             = e

The mapE examples still work ...

> e4 = Rec (\e1-> Add (Const 1)
>                     (Rec (\e2-> Add e1 e2))) -- nested loop
> e4u = unwind e4       -- infinite
> e7 = mapE succ e4
> e8 = Rec (\e1-> Add (Const 2)
>                     (Rec (\e2-> Add e1 e2)))
> b4 = e7==e8   -- returns True!

... and we verify that we didn't leave behind any Loops/Stops.

> hasLoop (Const _)   = False
> hasLoop (Add e1 e2) = hasLoop e1 || hasLoop e2
> hasLoop (Rec fe)    = hasLoop (fe (Const 1))
> hasLoop (Loop _)    = True
> hasLoop (Stop _)    = True
>
> hasLoop e7  -- False

Unwound infinite data structures (even with nested loops) exhibit
sharing: main runs in constant space.

> main = walk e4u where
>   walk (Add e1 e2) = walk e2

Andrew
```