[GHC] #9476: Implement late lambda-lifting
GHC
ghc-devs at haskell.org
Tue Dec 4 17:07:19 UTC 2018
#9476: Implement late lambda-lifting
-------------------------------------+-------------------------------------
Reporter: simonpj | Owner: sgraf
Type: feature request | Status: closed
Priority: normal | Milestone: 8.8.1
Component: Compiler | Version: 7.8.2
Resolution: fixed | Keywords: LateLamLift
Operating System: Unknown/Multiple | Architecture:
Type of failure: Runtime | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: #8763 #13286 | Differential Rev(s): Phab:D5224
Wiki Page: LateLamLift |
-------------------------------------+-------------------------------------
Comment (by sgraf):
Replying to [comment:70 simonpj]:
> * To do this, either we need to add a RTS flag or... doesn't it happen
anyway when we have a very small nursery? Each time the nursery runs out,
we GC, and with -G1 that's a major GC. What am I missing?
Yes, doing `-G1` and varying the initial heap size with `-A$iM` was
exactly what I have been doing to find out a close approximation of the
maximum residency (the one of related to program semantics, not GC
samples).
(Note that when a GC happens with `-G1`, the heap is resized to 2*bytes
copied according to `-S`, so the 'nursery' grows in `-G1`. Also I don't
think that we can vary GC frequency through an RTS flag, because to my
knowledge, the decision when to GC is entirely made by the mutator)
Example: Let's say I start with `-A40M` (always `-G1` from now on) and
let's assume that the picture shows the actual heap profile (e.g., with
maximum sampling frequency), where the points in time at which GC runs are
indicated in red:
[[Image(https://i.imgur.com/RpeuCXc.jpg, 400px)]]
The maximum residency observed by GC is smaller than the actual maximum
residency, which I call the maximum working set for the remainder of this
post for disambiguation. How can we make the sampled maximum residency
approximate the maximum working set? We can vary the GC frequency by
setting the initial heap size to a different value. It turns out that
within the integer range of 1 to 200, `-A56M` seems to be the candidate
with the closest approximation (closest meaning it sampled the highest
maximum residency):
[[Image(https://i.imgur.com/BfvbKSl.jpg, 400px)]]
In particular, we don't really care how often major GC happens, only that
one major GC happens ''immediately before'' the working set collapses.
The `-A56M` is highly specific to the program (in this case, `default`
above) and doesn't really matter, it's just the parameterisation where the
sampled maximum residency most closely approximates the actual maximum
working set.
It's similar to modular arithmetic, if that's of any help: Imagine you
want to find a number below 20 with the closest multiple to 73 (prime),
but not larger than that. There's 8*9=72 or 6*12 or 4*18, etc. It doesn't
really matter which factor you choose, point is that you get close to 73.
The specific parameterisation for `allow-cg` to approximate its maximum
working set was `-A76M -G1`, but it doesn't matter; what matters is that
the sampled maximum residency for `default` was lower than `allow-cg` (192
vs. 196MB), so everything as expected.
> Then I believe you are saying "Ignoring closure growth has no effect on
residency". Is that right?
Well, it's hard to tell how close those samples are to the actual maximum
working set, but under the assumption that they are close enough, I'd say
that ignoring closure growth indeed leads to a bigger working set in this
case (as to be expected), although it doesn't really manifest itself in
the majority of GC parameterisations I sampled (cf. the second-to-last
paragraph in comment:65).
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9476#comment:71>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list