[GHC] #9675: Unreasonable memory usage on large data structures
GHC
ghc-devs at haskell.org
Mon Oct 13 08:43:40 UTC 2014
#9675: Unreasonable memory usage on large data structures
-------------------------------------+-------------------------------------
Reporter: Polarina | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.8.3
Resolution: | Keywords:
Operating System: Linux | Architecture: x86_64 (amd64)
Type of failure: Compile- | Difficulty: Unknown
time performance bug | Blocked By:
Test Case: | Related Tickets:
Blocking: |
Differential Revisions: |
-------------------------------------+-------------------------------------
Comment (by simonpj):
I agree that it is tantalising that so much `Core` is necessary to
generate so little object code. Nevertheless, I am ''deeply'' reluctant
to fiddle with the representation of `Core`. Doing so will impose
overheads on every pass that deals with Core, and I'm far from convinced
that it'll solve the problem.
Anything along these lines would risk breaking optimisations. For example,
here is how GHC optimises repeated evaluation of the same data
constructor. Consider
{{{
case x of
K a _ _ _ -> ... (case x of
K _ _ c _ -> ...a..c..) ...
}}}
Clearly we'd like to optimise this to
{{{
case x of
K a _ c _ -> ... ( ...a..c.. ) ...
}}}
And GHC does so by (effecitvely) transforming to
{{{
case x of
K a bb cc dd -> let x = K a bb cc dd in
... (case x of
K _ _ c _ -> ...a..c..) ...
}}}
The `let` shadows the outer binding of `x`. Now the inner `case` can see
that `x` is bound to `(K a bb cc dd)`, and so it's easy to do case-of-
known-constructor to the inner case, giving
{{{
case x of
K a bb cc dd -> let x = K a bb cc dd in
... (let c = cc in ...a..c..) ...
}}}
(The `let` doesn't really exist, of course, it's only in the Simplifier's
head.)
All this is threatened if the outer case binds only `a`.
I think a decent alternative approach would be not to generate code for
selectors (at least when there are many fields) but rather to inline them
at usage sites, via a so-called "compulsory unfolding". That too risks
lots of code if you use a lot of record selectors, so it's not exactly a
solid fix.
The other thing is that I think it's quite likely that GHC has accumulated
a space leak. The compiler makes repeated passes over the `Core` program.
To avoid a leak, all vestiges of the input of each pass should be gone by
the time we have computed the output of that pass. But I would not be
amazed if that was not true at all -- that vesiges of pass 1 were still in
memory when we were on pass 10.
I would '''love''' someone to really look at this for us. It's not easy,
but it could give a huge payoff for every GHC compilation, not just these
weird corner cases. I certainly think we should be sure this is ''not''
happening before investing effort in optimising particular cases
Simon
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9675#comment:7>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list