[GHC] #9400: poor performance when compiling modules with many Text literals at -O1

GHC ghc-devs at haskell.org
Mon Aug 4 03:23:40 UTC 2014


#9400: poor performance when compiling modules with many Text literals at -O1
-------------------------------------+-------------------------------------
              Reporter:  rwbarton    |            Owner:
                  Type:  bug         |           Status:  closed
              Priority:  normal      |        Milestone:
             Component:  Compiler    |          Version:  7.8.3
            Resolution:  invalid     |         Keywords:
      Operating System:              |     Architecture:  Unknown/Multiple
  Unknown/Multiple                   |       Difficulty:  Unknown
       Type of failure:  Compile-    |       Blocked By:
  time performance bug               |  Related Tickets:  #9370
             Test Case:              |
              Blocking:              |
Differential Revisions:              |
-------------------------------------+-------------------------------------
Changes (by rwbarton):

 * status:  new => closed
 * resolution:   => invalid


Comment:

 OK this is a bit funny.

 Normally a Text literal `"abc"` gets desugared as
 {{{
 fromString $fIsStringText (unpackCString# "abc"#)
 }}}
 Now `fromString $fIsStringText = pack`, and `pack = unstream . S.map safe
 . S.streamList`, and there is a rule in `Data.Text`
 {{{
 {-# RULES "TEXT literal" forall a.
     unstream (S.map safe (S.streamList (GHC.unpackCString# a)))
       = unpackCString# a #-}
 }}}
 and `Data.Text.unpackCString#` has a NOINLINE pragma, so we end up with the
 nice small code: `Data.Text.unpackCString# "abc"#`.
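
 The mechanism above can be reproduced in miniature without the text package. The following is only a sketch under stated assumptions: `MiniText`, `packSlow`, `packLiteral#` and the rule name are hypothetical stand-ins for `Text`, the `unstream . S.map safe . S.streamList` pipeline and `Data.Text.unpackCString#`; they are not the library's actual definitions.

```haskell
{-# LANGUAGE MagicHash, OverloadedStrings #-}

import Data.String (IsString (..))
import GHC.Exts (Addr#, unpackCString#)

-- Hypothetical stand-in for Text; not the real representation.
newtype MiniText = MiniText String
  deriving (Eq, Show)

-- Replace surrogate code points, loosely mirroring the role of `safe`.
safe :: Char -> Char
safe c
  | c >= '\xD800' && c <= '\xDFFF' = '\xFFFD'
  | otherwise = c

-- Stand-in for the general packing pipeline. INLINE [0] delays inlining
-- until the final simplifier phase, so the rule below can match first.
packSlow :: String -> MiniText
packSlow = MiniText . map safe
{-# INLINE [0] packSlow #-}

-- Stand-in for Data.Text.unpackCString#. NOINLINE keeps each call site
-- down to a single function application. Skipping `safe` here assumes a
-- plain ASCII literal, which contains no surrogate code points.
packLiteral# :: Addr# -> MiniText
packLiteral# a = MiniText (unpackCString# a)
{-# NOINLINE packLiteral# #-}

instance IsString MiniText where
  fromString = packSlow

-- Collapse the general pipeline when its argument is a string literal.
{-# RULES "MiniText literal" forall a.
    packSlow (unpackCString# a) = packLiteral# a #-}

-- A multi-character literal, which should be eligible for the rule.
abc :: MiniText
abc = "abc"
```

 Compiling this together with a small `main` using `ghc -O1 -ddump-rule-firings` should show whether the rule fires for `abc`. The result is the same either way; only the size of the generated code differs.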

 ''But'', a ''single-character'' literal `"a"` is instead desugared as
 {{{
 fromString $fIsStringText (: (C# 'a') ([]))
 }}}
 and now no rule matches this pattern. `unstream` is marked `INLINE [0]`,
 as Simon predicted, and it's rather large. Since most XML entities
 represent single Unicode characters, GHC generated around 2000 copies of
 `unstream`.

 I don't know why there is an `INLINE` pragma on `unstream`; perhaps for no
 good reason. But anyway, there is a simple fix to the text package: add
 another rule to match the pattern `unstream (S.map safe (S.streamList
 [c]))`. (And similarly for empty string literals.)
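
 A minimal sketch of what such extra rules could look like, written against toy definitions rather than the text package itself. Everything here (`MiniText`, `packSlow`, `emptyMini`, `singletonMini`, and the rule names) is hypothetical and chosen for illustration; the real fix would target `unstream (S.map safe (S.streamList ...))` and whatever helper functions the text maintainers prefer.

```haskell
-- Hypothetical stand-in for Text; not the real representation.
newtype MiniText = MiniText String
  deriving (Eq, Show)

-- Replace surrogate code points, loosely mirroring the role of `safe`.
safe :: Char -> Char
safe c
  | c >= '\xD800' && c <= '\xDFFF' = '\xFFFD'
  | otherwise = c

-- Stand-in for the large, late-inlined packing pipeline.
packSlow :: String -> MiniText
packSlow = MiniText . map safe
{-# INLINE [0] packSlow #-}

-- Small NOINLINE targets for the two literal shapes that the existing
-- unpackCString# rule does not cover.
emptyMini :: MiniText
emptyMini = MiniText []
{-# NOINLINE emptyMini #-}

singletonMini :: Char -> MiniText
singletonMini c = MiniText [safe c]
{-# NOINLINE singletonMini #-}

-- Match the empty list directly.
{-# RULES "MiniText empty literal"
    packSlow [] = emptyMini #-}

-- Match a one-element list, i.e. `c : []`.
{-# RULES "MiniText singleton literal" forall c.
    packSlow [c] = singletonMini c #-}
```

 Both rules preserve the meaning of `packSlow`; they only replace a large inlined body with a call to a small function, which is the point of the proposed fix.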

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9400#comment:3>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

