[Haskell-cafe] Indirect Cycle Detection problem [was: finding "good work" in CS]

Thu Dec 19 17:16:17 UTC 2013

Hi Richard,

Am 09.12.2013 04:00, schrieb Richard A. O'Keefe:
>
 > [...]
> Here's a piece of computer science that I would like some help
> with.  I call it the Indirect Cycle Detection problem.
>
> Given:  Domains P and E,
>          functions f : P -> Maybe P
>                    g : P -> E
>
> Define  to_list :: Maybe P -> [E]
>          to_list Nothing = []
>          to_list (Just p) = g p : to_list (f p)
>
> Given:  That f is cyclic starting at p0.
>
> Find:   The shortest alpha, beta such that
> 	to_list p0    is   alpha ++ cycle beta
>          and do so *efficiently*.
>
> Now, I can use tortoise-and-hare to find a cycle in f
> and then use brute force to find a shortest prefix and
> cycle of to_list ... The stuff I've checked so far
> about periods in strings has nothing to say about
> periods that begin _after_ a non-empty prefix.
>

I am not sure if you're really looking for help here or just wanted to 
present an example. I assume the former.

Also, I can not see where this "non-empty prefix" notion comes from. 
Perhaps you have a different definition for cyclic?

For me: "f is cyclic starting at p0" means there exist an N>0 such that 
f^N p0 == p0. Let's assume n to be the smallest number that fulfills 
this condition and call it the period length of f_p0.

f_p0 defines a periodic list ps with the same period length n.

ps = map fromJust $ iterate (>>= f) (Just p0)

Your (to_list p0), which I'd like to call es could now be written as:
es = map g ps

We know that es has a period of length n, but if g is not injective, its 
smallest period of length k may be smaller than n.

period_n = take n es

To find k, we search for the second occurrence of period_n in (period_n 
++ period_n), knowing that it will be at position n, if it is not found 
at positions [1..n-1]. We might use the fact that it can only be in 
positions that divide n. On the other hand, we could just use any of the 
well known string-search-methods.

Efficiency:

- If computation of f is done in constant time, finding n takes Theta(n) 
time. Just apply f until you reach p0 again.

- Searching a pattern of size n in a text of size 2n can be done in O(n) 
using the Knuth-Morris-Pratt-algorithm.

In total you have a linear algorithm for finding your result:
alpha = []
beta = take k es
      = take k period_n

*****

Of course, I guess now you're going to tell me that your "non-empty 
prefix" means that the period of f is not starting at p0 but at some pi 
that is reached by applying f i times to p0. You can adapt to that, but 
you have to pay for it ;-)

In this case, ps has the form (prefix ++ (cycle p_period_n)) and to find 
this structure, you need to find the first element of ps that occurs 
twice. (Before, you knew this would be p0.) For that, you need a set 
implementation for elements of P that gives you efficient adding of one 
element and lookup. Since we know nothing about P's elements, we use the 
list itself and get a runtime of Theta((i+n)^2).

Finding the (smallest) period k in es is exactly the same as before, 
i.e. we look for the second occurrence of period_n in (period_n ++ 
period_n) with
period_n = map g p_period_n

The prefix does not bother us here, because we know that we are looking 
at the periodic part.

Now that we know the period length k, we still have to find where it 
starts. For that we start at position i and go backwards until we find a 
difference to the found out period. That is, we find the position of the 
first mismatch between (reverse $ take i es) and (cycle (reverse $ take 
k period_n)).
The index tells us the length of alpha and the rotation needed to get 
beta from (take k period_n).
The cost for that is O(i).

So, the total cost in this case would be O((i+n)^2 + n + i).

I could have implemented this in Haskell, but since the point in the 
original discussion was about not needing programmer skills, but CS 
skills, I refrained from it. Haskell is only used to clarify points, in 
a language which both he recipient and the recipient know. :-)

Thomas