[Haskell-beginners] stack overflow summing numbers read from a big file

Graham Gill math.simplex at gmail.com
Sat Mar 23 16:41:14 CET 2013


I think the problem is with laziness in the accumulator of "sum".

The prelude "sum" is defined as

sum     l       = sum' l 0
   where
     sum' []     a = a
     sum' (x:xs) a = sum' xs (a+x)

For example, if in GHCi I execute

> sum [1..50000000]
<interactive>: out of memory

I get an out of memory exception.

On the other hand, if I enable -XBangPatterns and define

mysum xs = sum' xs 0
   where sum' [] a = a
         sum' (x:xs) !a = sum' xs (x + a)


Then in GHCi

> mysum [1..50000000]
1250000025000000

which as you can easily check using n * (n + 1) / 2 with n = 50000000 is 
correct.

I don't know whether I have used the strictness annotation in the best 
way, but it works.

See http://www.haskell.org/haskellwiki/Performance/Strictness, which had 
some helpful hints.

Graham

On 23/03/2013 11:29 AM, Axel Wegen wrote:
> mukesh tiwari <mukeshtiwari.iiitm at gmail.com> writes:
>> It's already mentioned there "A String is represented as a list of
>> Char values; each element of a list is allocated individually, and has
>> some book-keeping overhead. These factors affect the memory
>> consumption and performance of a program that must read or write text
>> or binary data. On simple benchmarks like this, even programs written
>> in interpreted languages such as Python can outperform Haskell code
>> that uses String by an order of magnitude".
> I assumed that just means using plain Strings for that job will take
> more time.
>
>> import qualified Data.ByteString.Lazy.Char8 as BS
>> import Data.Maybe ( fromJust )
>>
>> readI :: BS.ByteString -> Integer
>> readI = fst . fromJust . BS.readInteger
>>
>> main = BS.interact sumFile where
>> sumFile = BS.pack . show . sum . map readI . BS.words
> I get the same result, the stack overflow. Though I don't have wait as
> long for it to break.
>
> I think that there is something about the summation that makes it
> impossible for the compiler to do it's magic and optimize the thing to
> something less stack overflowing. I just don't understand what that is.
>
> --
> Axel Wegen
>
> _______________________________________________
> Beginners mailing list
> Beginners at haskell.org
> http://www.haskell.org/mailman/listinfo/beginners




More information about the Beginners mailing list