[Haskell-cafe] Thunks and GHC pessimisation

Tom Ellis tom-lists-haskell-cafe-2013 at jaguarpaw.co.uk
Sun Feb 24 18:49:04 CET 2013


To avoid retaining a large lazy data structure in memory it is useful to
hide it behind a function call.  Below, "many" is used twice.  Because it is
hidden behind a function call, the list it builds can be garbage collected
between uses.  That's good.  When compiling with "-O", however, GHC 7.4.1
seems to keep the list in memory anyway, and the program exhausts its 5 MB
heap.  That's bad.  (I can't read Core, so I don't know exactly what's going
on.)  Replacing one of the "many" in "twice" with "different_many" makes
everything fine again.
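
For reference, here is a sketch of that working variant, with only the
second call in "twice" switched to "different_many".  The file name
thunkfail_ok.hs is made up for illustration; everything else is taken from
thunkfail.hs below.

```haskell
-- thunkfail_ok.hs (hypothetical file name, not part of the original test)
import Data.List (foldl')

million :: Int
million = 10 ^ (6 :: Int)

-- Each function builds the list afresh when applied to ().
many :: () -> [Int]
many () = [1..million]

-- Same definition as "many", under a different name.
different_many :: () -> [Int]
different_many () = [1..million]

-- The two components now call different functions, so they are no longer
-- the same expression written twice.
twice :: (Int, Int)
twice = (foldl' (+) 0 (many ()), foldl' (+) 0 (different_many ()))

main :: IO ()
main = print twice
```

Built with the same "-O" command line as thunkfail.hs, this is the version
that stays within "+RTS -M5M" for me.  I have not worked out why the name
change matters.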

Is this considered a bug in GHC?  Is it a known bug?  It is incredibly
concerning that GHC would perform this kind of pessimisation.

Tom


% cat thunkfail.hs
{-# OPTIONS_GHC -fno-warn-unused-binds #-}

import Data.List

million :: Int
million = 10 ^ (6 :: Int)

many :: () -> [Int]
many () = [1..million]
                  
different_many :: () -> [Int]
different_many () = [1..million]

twice :: (Int, Int)
twice = (foldl' (+) 0 (many ()), foldl' (+) 0 (many ()))

main :: IO ()
main = print twice

% ghc -fforce-recomp -Wall -Werror -rtsopts thunkfail.hs && ./thunkfail +RTS -M5M 
[1 of 1] Compiling Main             ( thunkfail.hs, thunkfail.o )
Linking thunkfail ...
(1784293664,1784293664)

% ghc -O -fforce-recomp -Wall -Werror -rtsopts thunkfail.hs && ./thunkfail +RTS -M5M
[1 of 1] Compiling Main             ( thunkfail.hs, thunkfail.o )
Linking thunkfail ...
Heap exhausted;
Current maximum heap size is 5242880 bytes (5 MB);
use `+RTS -M<size>' to increase it.


