[GHC] #9332: Memory blowing up for strict sum/strict foldl in ghci
GHC
ghc-devs at haskell.org
Mon Jul 21 14:47:20 UTC 2014
#9332: Memory blowing up for strict sum/strict foldl in ghci
-------------------------------------+-------------------------------------
Reporter: | Owner:
artella.coding | Status: closed
Type: bug | Milestone: 7.8.4
Priority: high | Version: 7.8.3
Component: GHCi | Keywords: ghci, strict,
Resolution: fixed | memory
Differential Revisions: | Operating System: Linux
Architecture: x86_64 | Type of failure: Runtime
(amd64) | performance bug
Difficulty: Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: |
-------------------------------------+-------------------------------------
Comment (by artella.coding):
"namely the tension between sharing computation and saving space." - but
in this case there doesn't seem to be any sharing of computation (unless I
am mistaken). The key things I noted were :
a)The result of `takeWhile (< 10000000) longList` seems to be persisting,
and not `longList` itself
b)The result of `takeWhile (< 10000000) longList` is only used once in
subsequent invocations of `main`, and therefore it is not shared
'''between''' computations
My way that I arrived at my deductions were as follows :
a)Firstly we note that it is the list resulting from `takeWhile (<
10000000) longList` which seems to be persisting, and not `longList`
itself. This is supported by the fact that if you replace it with
`takeWhile (< 10) longList` the program no longer blows up. If it was
`longList` that was persisting, then the program would blow up in both
cases.
b)So given that `takeWhile (< 10000000) longList` persists (and not
`longList`), the question then remains as to whether `takeWhile (<
10000000) longList` is '''shared between subsequent''' computations. In
order to test this I put a `trace` command to see whether `takeWhile (<
10000000) longList` is invoked twice when `main` is run twice from within
ghci. That is if we modify the program in comment 10
(https://ghc.haskell.org/trac/ghc/ticket/9332#comment:10) to the following
:
{{{
--Sort.hs
module Sort (result) where
import Data.List
import Debug.Trace
longList::[Int]
longList = [1 .. ]
result :: Int
result = foldl' (+) 0 $ (trace "test") takeWhile (< 10) longList
}}}
and :
{{{
--Main.hs
module Main (main) where
import Sort (result)
main = do
print $ result
}}}
Then we find that upon running `main` twice in ghci we get :
{{{
> main
test
45
> main
45
}}}
So we note that `"test"` is only printed out once, which indicates that
once `result` is calculated its value is cached, which means that the list
arising from `takeWhile (< 10) longList` is only used once, and hence is
not shared between subsequent computations. Therefore if the result of
`takeWhile (< 10) longList` is not shared between subsequent computations,
why does it persist in memory?
Thanks
Replying to [comment:11 simonpj]:
> There is a difficult tension here, described in Chapter 23 of my
[http://research.microsoft.com/en-us/um/people/simonpj/papers/slpj-
book-1987/index.htm 1987 book], namely the tension between sharing
computation and saving space. I do not know of any comprehensive answer.
>
> The problem tends to bite less when (as is commonly the case) numbers
like `1000000` don't appear in your program but rather are computed from
the input or the command line flags. But it's still definitely a problem
(e.g. with `[1..]`.)
>
> A possible solution might be to revert any CAFs that are retaining a
great deal of space, by turning them back into their un-computed form.
(Of course, their uncomputed form might retain a great deal of space too!)
>
> Simon
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9332#comment:12>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list