[GHC] #13226: Compiler allocation regressions from top-level string literal patch

GHC ghc-devs at haskell.org
Fri Feb 3 00:32:40 UTC 2017


#13226: Compiler allocation regressions from top-level string literal patch
-------------------------------------+-------------------------------------
        Reporter:  dfeuer            |                Owner:
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:  8.2.1
       Component:  Compiler          |              Version:  8.1
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Compile-time      |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by dfeuer):

 I've spent a bit more time looking at T12425. A couple of things I've
 noticed:

 One reason we're getting more terms and types is pretty obvious. Where we
 used to have

 {{{#!hs
 lvl3_r2p3 = unpackCString# "T12425.hs"#
 }}}

 we now have

 {{{#!hs
 lvl6_r2GO = "T12425.hs"#
 lvl7_r2GP = unpackCString# lvl6_r2GO
 }}}

 Another reason is less obvious. We've always had multiple copies of some
 strings, but now we're getting ''more'' repeats. I don't know why that is.

 I also don't really know if any of this is actually responsible for the
 allocation change, but it could be responsible for some of it. It looks
 like there are more terms, types and allocations for every stage past
 parsing, but float out is (unsurprisingly) prominent.

 My big question is why we're allowing these literals to be duplicated in
 the first place. I would think, naively perhaps, that we could assign each
 of them a unique, top-level identity from the start, and merely mark use-
 sites as inline or not, rather than actually inlining. When floating out,
 we could merge any two bindings in the same scope that refer to the same
 string identity. Those top-level identities would ultimately be resolved
 as either "inline-only" (not appearing in any unfoldings, etc., so they
 can be discarded) or otherwise. Of course we'd need to deal with the fact
 that built-in rules and such may be able to produce new literals, but I
 would think that relatively straightforward.
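
 To make the merging step concrete, here is a minimal sketch over a toy
 model. Everything in it is hypothetical: `Rhs`, `Binding`,
 `dedupLiterals` and the binder names are made up for illustration and are
 not GHC's Core types or any existing pass. It assumes a flat list of
 top-level bindings, keeps the first binder for each distinct literal, and
 redirects later uses to it.

 {{{#!hs
 import qualified Data.Map.Strict as M
 import Data.List (foldl')

 -- Toy stand-ins for Core; these are NOT GHC's real data types.
 type Name = String

 data Rhs
   = Lit String    -- a primitive string literal, e.g. "T12425.hs"#
   | Unpack Name   -- unpackCString# applied to a literal binder
   deriving (Eq, Show)

 type Binding = (Name, Rhs)

 -- Keep one top-level binder per distinct literal and point every
 -- use of a dropped binder at the surviving one.
 dedupLiterals :: [Binding] -> [Binding]
 dedupLiterals binds = [ (n, rename rhs) | (n, rhs) <- binds, keep n rhs ]
   where
     -- First binder seen for each literal's contents.
     canon :: M.Map String Name
     canon = foldl' step M.empty binds

     step m (n, Lit s) = M.insertWith (\_new old -> old) s n m
     step m _          = m

     -- Every literal binder mapped to its canonical binder.
     subst :: M.Map Name Name
     subst = M.fromList
       [ (n, c) | (n, Lit s) <- binds, Just c <- [M.lookup s canon] ]

     -- A literal binding survives only if it is the canonical one.
     keep n (Lit _) = M.lookup n subst == Just n
     keep _ _       = True

     rename (Unpack n) = Unpack (M.findWithDefault n n subst)
     rename rhs        = rhs

 -- Two copies of the same literal, each with its own unpacking.
 example :: [Binding]
 example =
   [ ("lvl6", Lit "T12425.hs")
   , ("lvl7", Unpack "lvl6")
   , ("lvl8", Lit "T12425.hs")
   , ("lvl9", Unpack "lvl8")
   ]
 }}}

 On `example` this drops `lvl8` and rewrites `lvl9` to use `lvl6`. The
 sketch merges only the literal bindings themselves; the two `Unpack`
 bindings remain distinct, and nothing here models scoping, unfoldings, or
 literals produced later by built-in rules.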

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13226#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

