[Haskell-begin] Maximum of a List?
Braden Shepherdson
Braden.Shepherdson at gmail.com
Sat Jul 26 22:45:05 EDT 2008
Steve Klabnik wrote:
> Hello everyone-
>
> I'm working on some project Euler problems today, and I'm stuck on one.
> It's not the problem itself that's the problem, it's that finding the
> maximum of a list makes me run out of heap space!
>
> http://projecteuler.net/index.php?section=problems&id=14
> <http://projecteuler.net/index.php?section=problems&id=14>
>
> my code:
>
> import Data.List
>
> f :: Int -> Int -> Int
> f acc x
> | x == 1 = acc
> | even x = f (acc + 1) (x `div` 2)
> | otherwise = f (acc + 1) (3 * x + 1)
>
> answer = (foldl' max 0) $ map (f 1) [1 .. 999999]
>
> I tried using 'foldl' max ' instead of 'maximum' because I thought that
> foldl' was supposed to work better than foldl or something...I could be
> confused on that point. Anyway, here's what I get...
>
> / _ \ /\ /\/ __(_)
> / /_\// /_/ / / | | GHC Interactive, version 6.4, for Haskell 98.
> / /_\\/ __ / /___| | http://www.haskell.org/ghc/
> \____/\/ /_/\____/|_| Type :? for help.
>
> Loading package base-1.0 ... linking ... done.
> Prelude> :l ..\test.hs
> Compiling Main ( ..\test.hs, interpreted )
> Ok, modules loaded: Main.
> *Main> answer
> GHC's heap exhausted: current limit is 268435456 bytes;
> Use the `-M<size>' option to increase the total heap size.
>
> I also tried writing a simple 'max' function that I thought was tail
> recursive, and that that would help. Same error. I'm finding it hard to
> believe that finding a maximum of a list of a million small integers
> would cause this kind of overflow...is it really that big of a problem?
Hello Steve,
First, is there a reason to use such an old GHC? Newer versions produce
much better code.
That said, certainly 6.4 is sufficient for this problem. foldl' is the
right choice here, it is strict in the accumulating parameter and that
is what you want. The problem is the two parameters in f. When you make
the recursive call 'f (acc + 1) ...', a thunk is being created to hold
that calculation (acc + 1). It isn't being evaluated right away, so the
thunk remains. On the next iteration, acc is incremented again. You are
effectively building up the expression 'acc + 1 + 1 + 1 + 1 ...',
perhaps quite deeply. This takes up heap space; apparently too much
space. Similarly, the (x `div` 2) and (3 * x + 1) calculations are being
suspended as thunks, though x gets forced on the next call, since it is
tested by the guards.
Adding the pragma
{-# LANGUAGE BangPatterns #-} at the top of the file, you could then
annotate acc:
f !acc x
this seems to run in constant space for me.
Note that there is a much more concise way to write f, though I doubt it
would be much faster. Consider 'iterate' and 'takeWhile'.
Finally, compiling to a binary with ghc -O2 will yield much better speed
than ghci. And GHC 6.8.3 would provide a further 20%+ speed boost, if
installing it is an option on your environment.
Hope this helps,
Braden Shepherdson
shepheb
More information about the Beginners
mailing list