[Haskell-cafe] Indirect Cycle Detection problem [was: finding "good work" in CS]

Fri Dec 27 01:13:06 UTC 2013

Hi Thomas and Richard,

> > [...]
> >Here's a piece of computer science that I would like some help
> >with.  I call it the Indirect Cycle Detection problem.
> >
> >Given:  Domains P and E,
> >         functions f : P -> Maybe P
> >                   g : P -> E
> >
> >Define  to_list :: Maybe P -> [E]
> >         to_list Nothing = []
> >         to_list (Just p) = g p : to_list (f p)
> >
> >Given:  That f is cyclic starting at p0.
> >
> >Find:   The shortest alpha, beta such that
> >	to_list p0    is   alpha ++ cycle beta
> >         and do so *efficiently*.
> >
> >Now, I can use tortoise-and-hare to find a cycle in f
> >and then use brute force to find a shortest prefix and
> >cycle of to_list ... The stuff I've checked so far
> >about periods in strings has nothing to say about
> >periods that begin _after_ a non-empty prefix.

As Thomas explained, that's sufficient. The tortoise-and-hare algorithm
will identify a tail of the sequence that is known to be periodic, and
then we can find the smallest period using an algorithm that operates on
a finite string. Once the actual cycle length is known, identifying the
prefix is easy.

> Also, I can not see where this "non-empty prefix" notion comes from.
> Perhaps you have a different definition for cyclic?

See http://en.wikipedia.org/wiki/Cycle_detection . 

> In this case, ps has the form (prefix ++ (cycle p_period_n)) and to
> find this structure, you need to find the first element of ps that
> occurs twice. (Before, you knew this would be p0.) For that, you
> need a set implementation for elements of P that gives you efficient
> adding of one element and lookup. Since we know nothing about P's
> elements, we use the list itself and get a runtime of
> Theta((i+n)^2).

The tortoise-and-hare algorithm does this in O(i+n) time.

I'm attaching some code. For simplicity, it works with functions
@f :: a -> a@, @g :: a -> b@, which define the sequence
@xs = map g (iterate f x)@. I'm using Brent's algorithm [1] for cycle
detection, which finds the period and a periodic suffix of @iterate f x@,
followed by Duval's algorithm [2] to identify the actual cycle length
of @xs at . (The main advantage of Duval's algorithm over Knuth-Morris-Pratt
is that less space is required. One disadvantage is that we need a total
order on @b at .) The total running time is linear in the cycle length and
initial prefix length of @iterate f x@, i.e. O(i+n).

There are two entry points of interest in the code.

- factor f g x
  Find prefix and repeated segment of @map g (iterate f x)@.
- factor' f g x
  Find prefix and repeated segment of @to_list x at .
  The repeated segment will be empty if @to_list x@ is a finite list.

Cheers,

Bertram

[1] http://en.wikipedia.org/wiki/Cycle_detection
[2] http://stackoverflow.com/questions/3459509/minimal-cyclic-shift-algorithm-explanation
-------------- next part --------------
A non-text attachment was scrubbed...
Name: duval.hs
Type: text/x-haskell
Size: 1952 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20131227/87eb9376/attachment.hs>