[Haskell-cafe] Thunks and GHC pessimisation

Don Stewart dons00 at gmail.com
Sun Feb 24 19:25:56 CET 2013


In toy examples like this it will be generally hard to convince GHC not to
just collapse your program down to a constant, when you're turning up the
optimization level.

In particular, you are implying -ffull-laziness with -O (or -O2), which can
increase sharing.

> GHC doesn't implement complete full-laziness. When optimisation in on,
and -fno-full-laziness is not  given, *some transformations that increase
sharing are performed, such as extracting repeated computations from a loop*.
These are the same transformations that a fully lazy implementation would
do, the difference is that GHC doesn't consistently apply full-laziness, so
don't rely on it.

http://www.haskell.org/ghc/docs/latest/html/users_guide/options-optimise.html

If you explicitly rely on this not happening, turn it off:

$ ghc -O2 -fno-full-laziness --make A.hs -rtsopts -fforce-recomp
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...

$ time ./A +RTS -M750k
(500000500000,500000500000)
./A +RTS -M750k  0.06s user 0.00s system 97% cpu 0.069 total

A 750k heap should be enough for anyone :)

-- Don

On Sun, Feb 24, 2013 at 5:49 PM, Tom Ellis <
tom-lists-haskell-cafe-2013 at jaguarpaw.co.uk> wrote:

> To avoid retaining a large lazy data structure in memory it is useful to
> hide it behind a function call.  Below, "many" is used twice.  It is hidden
> behind a function call so it can be garbage collected between uses.  That's
> good.  When compiling with "-O" it seems that GHC 7.4.1 decides to keep it
> in memory anyway.  That's bad.  (I can't read core so I don't know exactly
> what's going on).  Replacing one of the "many" in "twice" with
> "different_many" makes everything fine again.
>
> Is this considered a bug in GHC?  Is it a known bug?  It is incredibly
> concerning that GHC would perform this kind of pessimisation.
>
> Tom
>
>
> % cat thunkfail.hs
> {-# OPTIONS_GHC -fno-warn-unused-binds #-}
>
> import Data.List
>
> million :: Int
> million = 10 ^ (6 :: Int)
>
> many :: () -> [Int]
> many () = [1..million]
>
> different_many :: () -> [Int]
> different_many () = [1..million]
>
> twice :: (Int, Int)
> twice = (foldl' (+) 0 (many ()), foldl' (+) 0 (many ()))
>
> main :: IO ()
> main = print twice
>
> % ghc -fforce-recomp -Wall -Werror -rtsopts thunkfail.hs && ./thunkfail
> +RTS -M5M
> [1 of 1] Compiling Main             ( thunkfail.hs, thunkfail.o )
> Linking thunkfail ...
> (1784293664,1784293664)
>
> % ghc -O -fforce-recomp -Wall -Werror -rtsopts thunkfail.hs && ./thunkfail
> +RTS -M5M
> [1 of 1] Compiling Main             ( thunkfail.hs, thunkfail.o )
> Linking thunkfail ...
> Heap exhausted;
> Current maximum heap size is 5242880 bytes (5 MB);
> use `+RTS -M<size>' to increase it.
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20130224/503951ba/attachment.htm>


More information about the Haskell-Cafe mailing list