[Haskell-cafe] Indirect Cycle Detection problem [was: finding "good work" in CS]
Bertram Felgenhauer
bertram.felgenhauer at googlemail.com
Fri Dec 27 01:13:06 UTC 2013
Hi Thomas and Richard,
> > [...]
> >Here's a piece of computer science that I would like some help
> >with. I call it the Indirect Cycle Detection problem.
> >
> >Given: Domains P and E,
> > functions f : P -> Maybe P
> > g : P -> E
> >
> >Define to_list :: Maybe P -> [E]
> > to_list Nothing = []
> > to_list (Just p) = g p : to_list (f p)
> >
> >Given: That f is cyclic starting at p0.
> >
> >Find: The shortest alpha, beta such that
> > to_list p0 is alpha ++ cycle beta
> > and do so *efficiently*.
> >
> >Now, I can use tortoise-and-hare to find a cycle in f
> >and then use brute force to find a shortest prefix and
> >cycle of to_list ... The stuff I've checked so far
> >about periods in strings has nothing to say about
> >periods that begin _after_ a non-empty prefix.
As Thomas explained, that's sufficient. The tortoise-and-hare algorithm
will identify a tail of the sequence that is known to be periodic, and
then we can find the smallest period using an algorithm that operates on
a finite string. Once the actual cycle length is known, identifying the
prefix is easy.
> Also, I can not see where this "non-empty prefix" notion comes from.
> Perhaps you have a different definition for cyclic?
See http://en.wikipedia.org/wiki/Cycle_detection .
> In this case, ps has the form (prefix ++ (cycle p_period_n)) and to
> find this structure, you need to find the first element of ps that
> occurs twice. (Before, you knew this would be p0.) For that, you
> need a set implementation for elements of P that gives you efficient
> adding of one element and lookup. Since we know nothing about P's
> elements, we use the list itself and get a runtime of
> Theta((i+n)^2).
The tortoise-and-hare algorithm does this in O(i+n) time.
I'm attaching some code. For simplicity, it works with functions
@f :: a -> a@, @g :: a -> b@, which define the sequence
@xs = map g (iterate f x)@. I'm using Brent's algorithm [1] for cycle
detection, which finds the period and a periodic suffix of @iterate f x@,
followed by Duval's algorithm [2] to identify the actual cycle length
of @xs at . (The main advantage of Duval's algorithm over Knuth-Morris-Pratt
is that less space is required. One disadvantage is that we need a total
order on @b at .) The total running time is linear in the cycle length and
initial prefix length of @iterate f x@, i.e. O(i+n).
There are two entry points of interest in the code.
- factor f g x
Find prefix and repeated segment of @map g (iterate f x)@.
- factor' f g x
Find prefix and repeated segment of @to_list x at .
The repeated segment will be empty if @to_list x@ is a finite list.
Cheers,
Bertram
[1] http://en.wikipedia.org/wiki/Cycle_detection
[2] http://stackoverflow.com/questions/3459509/minimal-cyclic-shift-algorithm-explanation
-------------- next part --------------
A non-text attachment was scrubbed...
Name: duval.hs
Type: text/x-haskell
Size: 1952 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20131227/87eb9376/attachment.hs>
More information about the Haskell-Cafe
mailing list