[Haskell-beginners] Question about time consume when calculate prime numbers

Thu Sep 13 00:32:13 CEST 2012

You might also be interested in this paper:
www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf . I thought it was a good
analysis/

On Wed, Sep 12, 2012 at 8:53 AM, Yi Cheng <chengyidna at gmail.com> wrote:
> Thank you, very much. It's exactly the thing I want to know.
>
> Yi. Cheng
>
>
> On Wed, Sep 12, 2012 at 8:41 PM, Lorenzo Bolla <lbolla at gmail.com> wrote:
>>
>>
>>
>> On Wed, Sep 12, 2012 at 1:17 PM, Yi Cheng <chengyidna at gmail.com> wrote:
>>>
>>>
>>> Thanks for answering my question, but I'm still confused by some details.
>>> I don't quite agree with you that Eratosthenes algorithm must be
>>> implemented with a complexity of O(n^2) in space. When the n is used to
>>> calculate the primes below it, it can be implemented in space complexity
>>> O(n). For example, in languages, like C/C++, we can allocate a array. So I
>>> think the the complexity of O(n^2) in space you mentioned, is the complexity
>>> of "the beautiful code". So here's the question, can Eratosthenes algorithm
>>> be implemented in a more gentle way?
>>
>>
>> Correct: I referred to your implementation. See here
>> (http://www.haskell.org/haskellwiki/Prime_numbers#Sieve_of_Eratosthenes) for
>> many different (and more efficient) implementations of Eratosthenes.
>>
>>
>>> Then I think maybe there is a more beautiful and practical way to
>>> implement it.
>>> One method of mine is trying to judge whether a number is a prime just by
>>> the primes less than it, such as if the greatest common divisor of the
>>> number and the product of the primes less than it equals to 1. But the
>>> product of the primes is too large.
>>>
>>> So I wander if there is a concise method to solve the problem with a
>>> faster method. In my C++ version, the Eratosthenes is implemented in linear
>>> space complexity, and optimize in filtering the numbers which can be divided
>>> by a prime. This code is faster than the original algorithm implemented by
>>> me(It was also implemented it in C++, and slower than the following code).
>>> I know, when writing Haskell code, it would be better to forget some
>>> experience in command-line language, but I want to know whether there is a
>>> faster method to solve the problem.
>>>
>>>
>>> Thank you.
>>> Yi. Cheng
>>>
>>> The code in my c++ version.
>>> #include <iostream>
>>> using namespace std;
>>> int main(){
>>>     int p[2000000] = {0};
>>>     long sum = 0;
>>>     int f = 1;
>>>     for(long i=2; i <= 2000000; ++i){
>>>         if(p[i] == 0){
>>>             sum += i;
>>>             for(long j = i * i; j < 2000000; j += i)
>>>                 p[j] = 1;
>>>         }
>>>     }
>>>     cout<<sum<<endl;
>>>     return 0;
>>>
>>> }
>>>
>>
>> This implementation looks like the one here:
>> http://www.haskell.org/haskellwiki/Prime_numbers#From_Squares, with the
>> difference that in C++ you are modifying your array in-place instead of
>> generating it lazily. It's not equivalent to your "beautiful" version in
>> Haskell.
>>
>> hth,
>> L.
>>
>>
>>
>>
>>
>>>
>>> On Wed, Sep 12, 2012 at 5:26 PM, Lorenzo Bolla <lbolla at gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> On Wed, Sep 12, 2012 at 9:06 AM, Yi Cheng <chengyidna at gmail.com> wrote:
>>>>>
>>>>> Recently, I'm trying to solve some problems in project euler using
>>>>> haskell. When it came to problem 10, calculating the sum of all primes below
>>>>> 20000000, I try to write a program which can generate primes.
>>>>> In my memory Eratosthenes is faster than just whether a number can be
>>>>> divided by the number less then the square root of it.
>>>>> Firstly, I wrote the following programs:
>>>>>
>>>>> module Main where
>>>>> isPrime x = isPrime' 3 x (round . sqrt. fromIntegral $ x)
>>>>> isPrime' d target maxd
>>>>>   | d > maxd = True
>>>>>   | mod target d == 0 = False
>>>>>   | otherwise = isPrime' (d + 2) target maxd
>>>>>
>>>>> main = print $ (sum (filter isPrime [3,5..2000000]) + 2)
>>>>>
>>>>> And it consume about 11s in my computer.
>>>>> Then, I tried to figure out how to solve the problem by Eratosthenes,
>>>>> but failed. Later, I find a program implemented by others, meeting my
>>>>> purpose and I've used it to solve the problem:
>>>>>
>>>>> primes :: [Int]
>>>>> primes = primes' [2..]
>>>>>
>>>>> primes' :: [Int] -> [Int]
>>>>> primes' [] = []
>>>>> primes' (n:ns) = n : primes' (filter (\v -> v `mod` n /= 0) ns)
>>>>>
>>>>> solve x = sum $ primes' [2..x]
>>>>>
>>>>> main = print $ solve 2000000
>>>>>
>>>>> Well, although the code is beautiful, it is slow. Even waiting for a
>>>>> minute, no answer was printed.
>>>>>
>>>>> In C version, Eratosthenes is faster than the method implemented in my
>>>>> earlier code, which only consume 0.3s(the earlier method consume 1.6s).
>>>>>
>>>>> So I want to know, why Eratosthenes implemented in Haskell is slow than
>>>>> the ugly code implemented by me.
>>>>> Could anyone tell me?
>>>>>
>>>>
>>>> Eratosthenes's complexity is O(n^2) (both space and time), whereas the
>>>> "ugly" one has a sub-quadratic running complexity and linear in space.
>>>>
>>>> Try to profile them:
>>>> $> ghc -O2 --make -prof -auto-all <filename>
>>>> $> ./primes +RTS -p -hc
>>>> $> hp2ps primes.hp
>>>>
>>>> You'll see that most of the time is spent allocating space which is
>>>> never released.
>>>> You could play a bit with strictness, but the main problem is the awful
>>>> complexity of the algorithm.
>>>>
>>>> hth,
>>>> L.
>>>>
>>>>
>>>>
>>>>>
>>>>> Thank you
>>>>> Yi Cheng
>>>>>
>>>>> _______________________________________________
>>>>> Beginners mailing list
>>>>> Beginners at haskell.org
>>>>> http://www.haskell.org/mailman/listinfo/beginners
>>>>>
>>>>
>>>
>>
>
>
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://www.haskell.org/mailman/listinfo/beginners
>