[GHC] #9520: Running an action twice uses much more memory than running it once

Fri Aug 19 08:48:22 UTC 2016

#9520: Running an action twice uses much more memory than running it once
-------------------------------------+-------------------------------------
        Reporter:  snoyberg          |                Owner:
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.8.3
      Resolution:                    |             Keywords:
Operating System:  Linux             |         Architecture:  x86_64
 Type of failure:  Runtime           |  (amd64)
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by edsko):

 I have no answers here, just more questions. I ran into this problem again
 with a large project that uses conduit. My program suffered from a large
 memory leak, and in the `-hy` profile the types were reported as `->Pipe`
 and `Sink`; moreover, the `-hc` profile told me memory was being retained
 by a CAF. All of this pointed to the exact problem discussed in this
 ticket, and indeed adding

 {{{#!hs
 {-# OPTIONS_GHC -fno-full-laziness #-}
 }}}

 to the top of my module got rid of the problem. However, I can't say I
 fully understand what is going on. Experimenting with @snoyberg's
 examples, above, I noticed that the memory behaviour of these modules
 interacts in odd ways with profiling options, which doesn't make this any
 easier! For @snoyberg's first example
 (https://ghc.haskell.org/trac/ghc/ticket/9520#comment:4):

 {{{
     | No profiling | -prof   | -prof -fprof-auto
 ----+--------------+---------+------------------
 -O0 | OK           | OK      | OK
 -O1 | OK           | OK      | LEAK(1)
 -O2 | OK           | OK      | LEAK(1)
 }}}

 where OK means "runs in constant space" and LEAK(1) indicates a memory
 leak consisting of `Int`, `->Sink` and `Sink`, according to `+RTS -hy`. In
 other words, this has a memory leak ''only'' when ''both'' optimization
 ''and'' `-fprof-auto` are specified (`-prof` by itself is not enough).

 Bizarrely, for the second example the behaviour is reversed (perhaps this
 is why Michael concluded that this example "however, does '''not'''
 demonstrate the problem"?):

 {{{
     | No profiling | -prof   | -prof -fprof-auto
 ----+--------------+---------+------------------
 -O0 | OK           | OK      | OK
 -O1 | LEAK         | LEAK(1) | OK
 -O2 | LEAK         | LEAK(1) | OK
 }}}

 Unlike for the first example, here we also get a LEAK without any
 profiling enabled (as indicated by a very high maximum residency reported
 by `+RTS -s`).

 I added a third example:

 {{{#!hs
 foreign import ccall "doNothing" doNothing :: IO ()

 data Sink i r = Sink (i -> Sink i r) r

 sinkCount :: Sink i Int
 sinkCount =
     loop 0
   where
     loop cnt = Sink (\_ -> loop $! cnt + 1) cnt

 feed :: Sink Char r -> IO r
 feed =
     loop 10000000
   where
     loop 0 (Sink _ g) = return g
     loop i (Sink f _) = doNothing >> loop (i - 1) (f 'A')

 action :: IO ()
 action = do
     feed sinkCount
     return ()

 main :: IO ()
 main = do
     action
     action
 }}}

 This differs from @snoyberg's second example only in the additional call
 to `doNothing` in `feed.loop`; `doNothing` is defined in an external `.c`
 file:

 {{{#!c
 void doNothing() {}
 }}}

 (I used an externally defined C function because I wanted something that
 the optimizer couldn't get rid of but without getting all kinds of crud
 about `Handle`s etc in the core/STG output, which is what would happen
 with a print statement, say.) I have no idea why, but this program's
 memory behaviour is quite different from version 2:

 {{{
     | No profiling | -prof   | -prof -fprof-auto
 ----+--------------+---------+------------------
 -O0 | LEAK         | LEAK(2) | LEAK(2)
 -O1 | LEAK         | LEAK(1) | LEAK(1)
 -O2 | LEAK         | LEAK(1) | LEAK(1)
 }}}

 This program leaks no matter what we do, although the LEAK(2) reported
 here consists, according to `+RTS -hy`, of a different type (a single
 type, in fact: `PAP`).

 Getting to the bottom of this would require more time than I currently
 have; I guess for me the take-away currently is: full laziness is
 dangerous when using free monads such as conduit.
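
 For illustration only, here is a sketch of the kind of rewrite full
 laziness is allowed to perform on `sinkCount`. This is an assumption about
 the mechanism, not something established by the experiments above. The
 names `sinkCountOriginal` and `sinkCountFloated` are hypothetical, and
 `feed` is a simplified pure variant that takes the step count as an
 argument. In the floated version the next state no longer lives inside the
 lambda, so each `Sink` cell keeps a pointer to its successor; as long as
 the top-level value is still reachable (for example because a second run
 will need it), every cell visited so far stays reachable too.

 {{{#!hs
 data Sink i r = Sink (i -> Sink i r) r

 -- As written in the examples: the next state is built inside the
 -- lambda, so it is rebuilt on every call and nothing is shared.
 sinkCountOriginal :: Sink i Int
 sinkCountOriginal =
     loop 0
   where
     loop cnt = Sink (\_ -> loop $! cnt + 1) cnt

 -- Hand-floated variant (hypothetical): `next` does not mention the
 -- lambda's argument, so it can be bound outside the lambda. Once
 -- forced, it is shared by every later call of the same function.
 sinkCountFloated :: Sink i Int
 sinkCountFloated =
     loop 0
   where
     loop cnt =
         let next = loop $! cnt + 1
         in Sink (\_ -> next) cnt

 -- Simplified pure feed: push n inputs, then return the result.
 feed :: Int -> Sink Char r -> r
 feed 0 (Sink _ r) = r
 feed n (Sink f _) = feed (n - 1) (f 'A')

 main :: IO ()
 main = do
     print (feed 1000 sinkCountOriginal)
     print (feed 1000 sinkCountFloated)
 }}}

 Both calls compute the same result; only the sharing differs. Note that
 compiling this sketch with optimization may apply the same floating to
 `sinkCountOriginal`, so any difference in residency should only be
 expected at `-O0` or with `-fno-full-laziness`.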

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9520#comment:9>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler