[Haskell-cafe] Re: MonadFix

Fri Dec 21 05:33:18 EST 2007

Joost Behrends wrote:
> apfelmus writes:
> 
>> How about separating the candidate prime numbers from the recursion
>>
>>    factorize :: Integer -> [Integer]
>>    factorize = f primes'
>>       where
>>       primes' = 2:[3,5..]
>>       f (p:ps) n
>>          | r == 0    = p : f (p:ps) q
>>          | p*p > n   = [n]
>>          | otherwise = f ps n
>>          where
>>          (q,r) = n `divMod` p

> (besides: p < intsqrt n must stay, otherwise you have
> the expensive p*p at every step)

Huh?  p < intsqrt n  is evaluated just as often as  p*p > n , with 
changing  n  . Why would that be less expensive? Btw, the code above 
test for  r==0  first, which means that the following  p*p > n  is 
tested exactly once for every prime candidate  p .

> Providing effectively primes' for that is simply impossible
> talking about really big numbers 
> as i did in my post. There are no fast generators iterating just
> through the primes firstly

Sure, that's why I called it  primes' . It's indented to be an easily 
computable list of prime candidates and as you write below, we can do 
better than using all odd numbers for that.

> and these lists get much too big also 
> (for 2^120 you cannot even begin to use a list of the primes 
> up to 2^(any appropriate x) ).

Thanks to lazy evaluation, the list will be generated on demand and its 
elements are garbage collect once used. So, the code above will run in 
constant space. The list is more like a suspended loop.

> What can be done is to iterate through odd numbers meeting as many primes 
> as possible. We could do this:
> 
> iterdivisors x | x == 0 = 3
>                | x == 1 = 5
>                | otherwise x = iterdivisors (x-1) + ((cycle [2,4]) !! x)
> 
> This gives 7,11,13,17,19,23,25,29,31,35,37,41,43,47,49,53,55,59,61,63,67 ...
> 
> i.e. exactly all primes and odds with greater primefactors as 3.
> We can improve that cycle avoiding the multiples of 5:
> 
>  ... | otherwise x = iterdivisors (x-1) + ((cycle [2,4,2,4,2,4,6,2,6] !! x)
> 
> and we can do better by avoiding the multiples of 7 and so on
>
> (the length of these lists grows fast - it gets multiplied 
> by every new avoided prime -, but we could provide that lists 
> programmatically). And we must be sure, that cycle
> doesn't eat up memory for each new pass through the list. 
> And we should use a more efficient representaion 
> for the list of summands than a list.

Huh, this looks very expensive. I'd avoid indices like  x  altogether 
and use a plain list instead, we don't need random access to the prime 
candidates, after all.

> But the title of my post and much more interesting topic 
> for learning Haskell is, how to avoid memory exhaustion by recursion.

Maybe you stumbled over common examples for a stack overflow like

   foldr (+) 0
   foldl (+) 0

whereas

   foldl' (+) 0

runs without. See also

   http://en.wikibooks.org/wiki/Haskell/Performance_Introduction
   http://www.haskell.org/haskellwiki/Stack_overflow

> THIS was my intention and the reason why i erroneously brought MonadFix 
> into the game. The recursion i described as follows
> 
>> divisions = do
>>    y <- get
>>    if divisor y <= bound y then do
>>        put ( divstep y )
>>        divisions
>>        else return y
> 
> makes a DESTRUCTIVE UPDATE of the DivIters (by put)

Huh? The  State  monad doesn't do destructive updates, to the contrary. 
(neither do IORefs or STRefs, only STArrays or something do).

> and this kind of recursion
> seems not to "remember" itself (as i have understood, that is achieved by 
> "tail recursion"). I just didn't like making DivIters to States. 
> It's kind of lying code.

Because of lazy evaluation, tail recursion is not what it seems to be in 
Haskell.

> However it worked and improved performance by a factor around 10
> (or oo - perhaps a normal recursion exhausts 512MB memory for 2^120+1, 
> as it will do for much smaller Integers, if they are prime) 
> not to talk about footprint. Compiled for running standalone, 
> it took 17 minutes, an equivalent python script 2 hours.
> This factor near 7 is not fully satisfactory. 

Using the  State  monad introduces unnecessary overhead. Also, I assume 
that you ran the compiler with the  -O2  flag to enable optimizations?

> Or is there still a way of getting a tail recursive Haskell function 
> for iterating through the DivIters (outside any monads) ?? 
> I would not get stroke dead by surprise if yes, but i couldn't find any.

I don't understand why you use a complicated  DivIters  data structure. 
Passing two simple parameters does the job just fine.

Regards,
apfelmus