[GHC] #7307: Share top-level code for strings

GHC ghc-devs at haskell.org
Wed Aug 16 00:00:50 UTC 2017


#7307: Share top-level code for strings
-------------------------------------+-------------------------------------
        Reporter:  simonpj           |                Owner:  parcs
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.6.1
      Resolution:                    |             Keywords:  strings
Operating System:  Unknown/Multiple  |         Architecture:  Unknown/Multiple
 Type of failure:  Runtime           |            Test Case:
  performance bug                    |
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by bgamari):

 * keywords:   => strings


Old description:

> A string constant in GHC turns into
> {{{
> foo :: String
> foo = unpackCString# "the-string"#
> }}}
> This is a top-level thunk, and it expands into rather a lot of code like
> this
> {{{
> .text
>         .align 4,0x90
>         .long   0
>         .long   22
> .globl _Foo_zdfTypeableTzuds1_info
> _Foo_zdfTypeableTzuds1_info:
> .LcvI:
>         movl %esi,%eax
>         leal -12(%ebp),%ecx
>         cmpl 84(%ebx),%ecx
>         jb .LcvQ
>         addl $8,%edi
>         cmpl 92(%ebx),%edi
>         ja .LcvS
>         movl $_stg_CAF_BLACKHOLE_info,-4(%edi)
>         movl 100(%ebx),%ecx
>         movl %ecx,0(%edi)
>         leal -4(%edi),%ecx
>         pushl %ecx
>         pushl %eax
>         pushl %ebx
>         movl %eax,76(%esp)
>         call _newCAF
>         addl $12,%esp
>         testl %eax,%eax
>         je .LcvL
>         movl $_stg_bh_upd_frame_info,-8(%ebp)
>         leal -4(%edi),%eax
>         movl %eax,-4(%ebp)
>         movl $_cvJ_str,-12(%ebp)
>         addl $-12,%ebp
>         jmp _ghczmprim_GHCziCString_unpackCStringzh_info
> .LcvL:
>         movl 64(%esp),%eax
>         jmp *(%eax)
> .LcvS:
>         movl $8,116(%ebx)
> .LcvQ:
>         movl %eax,%esi
>         jmp *-12(%ebx)
> }}}
> That's rather a lot of goop for one thunk!  Of course we can share this,
> by making a 2-word thunk like this:
> {{{
> ------------------------------
> | TopUnpack_info  |   -------|-----> "the-string"#
> ------------------------------
> }}}
> where `TopUnpack_info` is a shared RTS info-table and code that embodies
> the code fragment above.
>
> This would save useless code bloat for every constant string.  (This came
> up when looking at the code generated by `deriving(Typeable)`.)

New description:

 A string constant in GHC turns into
 {{{#!hs
 foo :: String
 foo = unpackCString# "the-string"#
 }}}
 This is a top-level thunk, and it expands into rather a lot of code like
 this
 {{{
 .text
         .align 4,0x90
         .long   0
         .long   22
 .globl _Foo_zdfTypeableTzuds1_info
 _Foo_zdfTypeableTzuds1_info:
 .LcvI:
         movl %esi,%eax
         leal -12(%ebp),%ecx
         cmpl 84(%ebx),%ecx
         jb .LcvQ
         addl $8,%edi
         cmpl 92(%ebx),%edi
         ja .LcvS
         movl $_stg_CAF_BLACKHOLE_info,-4(%edi)
         movl 100(%ebx),%ecx
         movl %ecx,0(%edi)
         leal -4(%edi),%ecx
         pushl %ecx
         pushl %eax
         pushl %ebx
         movl %eax,76(%esp)
         call _newCAF
         addl $12,%esp
         testl %eax,%eax
         je .LcvL
         movl $_stg_bh_upd_frame_info,-8(%ebp)
         leal -4(%edi),%eax
         movl %eax,-4(%ebp)
         movl $_cvJ_str,-12(%ebp)
         addl $-12,%ebp
         jmp _ghczmprim_GHCziCString_unpackCStringzh_info
 .LcvL:
         movl 64(%esp),%eax
         jmp *(%eax)
 .LcvS:
         movl $8,116(%ebx)
 .LcvQ:
         movl %eax,%esi
         jmp *-12(%ebx)
 }}}
 That's rather a lot of goop for one thunk!  Of course we can share this,
 by making a 2-word thunk like this:
 {{{
 ------------------------------
 | TopUnpack_info  |   -------|-----> "the-string"#
 ------------------------------
 }}}
 where `TopUnpack_info` is a shared RTS info-table and code that embodies
 the code fragment above.
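
 The difference between the two layouts can be sketched as a toy model in
 plain Haskell.  This is only an illustration: `RawBytes`, `InfoTable`,
 `SharedThunk`, `topUnpackInfo`, `mkString` and `force` are hypothetical
 names that exist in neither GHC nor the RTS, and the model ignores CAF
 blackholing and thunk update entirely.
 {{{#!hs
 module Main where

 import Data.Char (chr, ord)
 import Data.Word (Word8)

 -- Stand-in for a NUL-terminated C string in the data section.
 type RawBytes = [Word8]

 toBytes :: String -> RawBytes
 toBytes s = map (fromIntegral . ord) s ++ [0]

 -- Stand-in for unpackCString#: decode bytes up to the first NUL.
 unpack :: RawBytes -> String
 unpack = map (chr . fromIntegral) . takeWhile (/= 0)

 -- Current scheme (modelled): each string gets its own entry code, with
 -- the address of its bytes baked into that code.
 perStringEntry :: RawBytes -> (() -> String)
 perStringEntry bytes = \() -> unpack bytes

 -- Proposed scheme (modelled): one shared info table, and a two-word
 -- closure holding the info pointer plus a pointer to the bytes.
 newtype InfoTable = InfoTable { entry :: RawBytes -> String }

 data SharedThunk = SharedThunk { info :: InfoTable, payload :: RawBytes }

 -- The single shared info table, written once.
 topUnpackInfo :: InfoTable
 topUnpackInfo = InfoTable unpack

 mkString :: RawBytes -> SharedThunk
 mkString = SharedThunk topUnpackInfo

 -- Entering the thunk runs the shared code on this closure's payload.
 force :: SharedThunk -> String
 force t = entry (info t) (payload t)

 main :: IO ()
 main = do
   let bs = toBytes "the-string"
   putStrLn (force (mkString bs))
   putStrLn (force (mkString (toBytes "another-string")))
   -- Both schemes yield the same string; only the code size differs.
   print (perStringEntry bs () == force (mkString bs))
 }}}
 In the model, adding a string costs one more `SharedThunk` value, whereas
 `perStringEntry` builds a fresh piece of "code" per string, which is the
 duplication the proposal is meant to remove.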

 This would avoid duplicating that code for every constant string.  (This
 came up when looking at the code generated by `deriving(Typeable)`.)
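
 To reproduce the starting point, a small module along the following lines
 may help.  It assumes a GHC with the `MagicHash` extension and the
 `GHC.CString` module from `ghc-prim`; the names `viaLiteral` and
 `viaUnpack` are made up for the example.
 {{{#!hs
 {-# LANGUAGE MagicHash #-}
 module Main where

 import GHC.CString (unpackCString#)

 -- What the programmer writes.
 viaLiteral :: String
 viaLiteral = "the-string"

 -- Roughly what the literal turns into: an Addr# literal handed to
 -- unpackCString#.
 viaUnpack :: String
 viaUnpack = unpackCString# "the-string"#

 main :: IO ()
 main = print (viaLiteral == viaUnpack)
 }}}
 Compiling this with `-ddump-simpl` or `-ddump-asm` should show the
 per-string code discussed above, though the exact output depends on the
 GHC version and platform.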

--

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/7307#comment:14>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

