[GHC] #11318: Data.Text.length allocates one closure per character

GHC ghc-devs at haskell.org
Wed Dec 30 15:46:05 UTC 2015


#11318: Data.Text.length allocates one closure per character
-------------------------------------+-------------------------------------
           Reporter:  bgamari        |             Owner:
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  Compiler       |           Version:  7.10.3
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  Runtime
  Unknown/Multiple                   |  performance bug
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:  #11284
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 In #11284 I noticed that `map Data.Text.length` applied to a list of
 `Text` values would result in code which would allocate one closure per
 character due to lack of lambda lifting.

 It turns out it's even worse than this.

 Consider this example (inspired by #11284),
 {{{#!hs
 module Hi2 (hello) where

 import qualified Data.Text as T

 hello :: Int
 hello = T.length hi

 hi :: T.Text
 hi = T.pack "hello"
 }}}

 When compiled with GHC 7.10.3 with `-O`, the following simplified Core is
 produced,
 {{{#!hs
 hello :: Int
 hello =
   case unpackCString# "hello"#
   of _ { Text dt_a33D dt1_a33E dt2_a33F ->
   let {
     a_a33C :: Int#
     a_a33C = +# dt1_a33E dt2_a33F } in
   letrec {
     $wloop_length_s3bt :: Int# -> Int# -> Int#
     $wloop_length_s3bt =
       \ (ww_s3bk :: Int#) (ww_s3bo :: Int#) ->
         case tagToEnum# @ Bool (>=# ww_s3bo a_a33C) of _ {
           False ->
             case indexWord16Array# dt_a33D ww_s3bo of r#_a35n { __DEFAULT
 ->
             case tagToEnum# @ Bool (geWord# r#_a35n (__word 55296)) of _ {
               False -> $wloop_length_s3bt (+# ww_s3bk 1) (+# ww_s3bo 1);
               True ->
                 case tagToEnum# @ Bool (leWord# r#_a35n (__word 56319)) of
 _ {
                   False -> $wloop_length_s3bt (+# ww_s3bk 1) (+# ww_s3bo
 1);
                   True -> $wloop_length_s3bt (+# ww_s3bk 1) (+# ww_s3bo 2)
                 }
             }
             };
           True -> ww_s3bk
         }; } in
   case $wloop_length_s3bt 0 dt1_a33E of ww_s3bs { __DEFAULT ->
   I# ww_s3bs
   }
   }
 }}}

 Even this simple example produces a `length` function which allocations

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/11318>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list