[Haskell-cafe] Detecting Cycles in Datastructures

Andrew Pimlott andrew at pimlott.net
Sun Nov 27 05:07:14 EST 2005


On Fri, Nov 18, 2005 at 11:37:40AM -0500, Paul Hudak wrote:
> This is a very late response to an old thread...

ditto :-)

> > unwind :: Expr -> Expr
> > unwind (Add e1 e2) = Add (unwind e1) (unwind e2)
> > unwind (Rec fe)    = x where x = unwind (fe x)
> > unwind e           = e

Since this discussion started around observing sharing in the
implementation, I wanted to see whether, by the time we convert your
cunning representation back into an infinite data structure, we have the
sharing we hoped to observe in the first place.

> > fe2 e = Add (Const 1) e          -- recursive
> > e2 = Rec fe2                                 -- top-level loop
> > e2u = unwind e2       -- infinite

    main = walk e4u where
      walk (Add e1 e2) = walk e2 

blows up, showing that we do not.  The problem with unwind seems to be
that the computation of the fixed point keeps calling unwind, which
keeps reconstructing the fixed point:

    e2u = unwind e2
        = x where x = unwind ((\e -> Add (Const 1) e) x)
                    = Add (Const 1) (unwind x)
                    = Add (Const 1) (Add (Const 1) (unwind (unwind x)))
                    = ...

I then tried

    unwind (Rec fe)    = unwind (fix fe)
    fix f = x where x = f x

even though I didn't think it would work: (fix fe) would create a
properly sharing structure, but unwind would unshare it:

    e2u = unwind (x where x = ((\e -> Add (Const 1) e) x))
                 (          = Add (Const 1) x)
        = Add (Const 1) (unwind (x where x = Add (Const 1) x))
        = Add (Const 1) (Add (Const 1) (unwind (x where x = Add (Const 1) x)))
        = ...

This does blow up in ghci, but not in ghc (6.4.1), even without
optimization.  I'm not quite sure why, but anyway I want a version that
exhibits sharing even in any reasonable implementation.

Your message gives the technique (in mapE); we only have to apply it to
unwind.  But there is a problem--your code has a bug!

> > mapE :: (Int->Int) -> Expr -> Expr
> > mapE f e = mapE' f e 0 where
> >   mapE' f (Const i)   n = Const (f i)
> >   mapE' f (Add e1 e2) n = Add (mapE' f e1 n) (mapE' f e2 n)
> >   mapE' f (Rec fe)    n = Rec (absLoop n (mapE' f (fe (Loop n)) (n+1)))
> >   mapE' f (Loop i)    n = Loop i
> 
> > absLoop :: Int -> Expr -> Fix Expr
> > absLoop n e = \e' ->
> >    let abs (Loop n') | n==n' = e'
> >        abs (Add e1 e2)       = Add (abs e1) (abs e2)
> >        abs e                 = e
> >    in abs e
>
> > e4 = Rec (\e1-> Add (Const 1)
> >                     (Rec (\e2-> Add e1 e2))) -- nested loop
> > e7 = mapE succ e4
> > e8 = Rec (\e1-> Add (Const 2)
> >                     (Rec (\e2-> Add e1 e2)))
> > b4 = e7==e8   -- returns True!

Notice that absLoop does not look inside Rec.  But there could be a Loop
(with the right n) there!  e7 is actually

    Rec (\e1-> Add (Const 2)
                   (Rec (\e2-> Add (Loop 0) e2)))

We might also cast a suspicious eye at (==), which spuriously returned
True!

Really, we want absLoop to eliminate all the Loops it can find.  But it
can't, because it only knows the replacement expression for one Loop.
It would be simpler for the Loop just to contain the expression.  To
enable that, I added a constructor Stop that is like Loop, except it
takes an Expr instead of an Int.  I use this constructor for my sharing
unwind as well; Loop is only needed for (==).  (It would probably be
even better to add an annotation type argument to Expr; this could
enable stricter typechecking that would have caught the bug.)

Complete code:

> {-# OPTIONS -fglasgow-exts #-}
> 
> data Expr = Const Int
>           | Add Expr Expr
>           | Rec (Fix Expr)            -- implicit loop
>           | Loop ID                   -- not exported
>           | Stop Expr
> 
> type Fix a = a -> a
> type ID    = Int
> 
> instance Eq Expr where
>   e1 == e2  =
>     let eq (Const x) (Const y)       n  =  x == y
>         eq (Loop i1) (Loop i2)       n  =  i1 == i2
>         eq (Add e1 e2) (Add e1' e2') n  =  eq e1 e1' n && eq e2 e2' n
>         eq (Rec fe1) (Rec fe2)       n  =  eq (fe1 (Loop n))
>                                               (fe2 (Loop n)) (n+1)
>         eq _ _                       n  =  False
>     in  eq e1 e2 0
> 
> unwind :: Expr -> Expr
> unwind e = absStop (unwind' e) where
>   unwind' (Add e1 e2) = Add (unwind' e1) (unwind' e2)
>   unwind' (Rec fe)    = Stop e where e = absStop (unwind' (fe (Stop e)))
>   unwind' e           = e
> 
> mapE :: (Int->Int) -> Expr -> Expr
> mapE f e = mapE' e where
>   mapE' (Const i)   = Const (f i)
>   mapE' (Add e1 e2) = Add (mapE' e1) (mapE' e2)
>   mapE' (Rec fe)    = Rec (\e -> absStop (mapE' (fe (Stop e))))
>   mapE' e@(Stop _)  = e

Replacement for absLoop that removes all Stops, unlike absLoop which
only removed the Loops that its caller owned.

> absStop (Stop e)      = e
> absStop (Add e1 e2)   = Add (absStop e1) (absStop e2)
> absStop e             = e

The mapE examples still work ...

> e4 = Rec (\e1-> Add (Const 1)
>                     (Rec (\e2-> Add e1 e2))) -- nested loop
> e4u = unwind e4       -- infinite
> e7 = mapE succ e4
> e8 = Rec (\e1-> Add (Const 2)
>                     (Rec (\e2-> Add e1 e2)))
> b4 = e7==e8   -- returns True!

... and we verify that we didn't leave behind any Loops/Stops.

> hasLoop (Const _)   = False
> hasLoop (Add e1 e2) = hasLoop e1 || hasLoop e2
> hasLoop (Rec fe)    = hasLoop (fe (Const 1))
> hasLoop (Loop _)    = True
> hasLoop (Stop _)    = True
>
> hasLoop e7  -- False

Unwound infinite data structures (even with nested loops) exhibit
sharing: main runs in constant space.

> main = walk e4u where
>   walk (Add e1 e2) = walk e2

Andrew


More information about the Haskell-Cafe mailing list