[GHC] #15126: Opportunity to compress common info table representation.

GHC ghc-devs at haskell.org
Sun May 6 18:30:49 UTC 2018


#15126: Opportunity to compress common info table representation.
-------------------------------------+-------------------------------------
           Reporter:  AndreasK       |             Owner:  (none)
               Type:  task           |            Status:  new
           Priority:  normal         |         Milestone:  8.6.1
          Component:  Compiler       |           Version:  8.2.2
           Keywords:  CodeGen        |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  None/Unknown
  Unknown/Multiple                   |
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 I've looked at a lot of GHC-produced assembly recently and noticed that
 most info tables describing stacks have the form:

 {{{
 .align 8
         .long   SDjR_srt-(block_cHmk_info)+296
         .long   0
         .quad   6151
         .quad   4294967326
 }}}

 I haven't fully dug into what each field means yet, but here are some
 observations:

 * I noticed that the second .long directive almost always ends up being
 zero.
 * While working out which field is which, I realized that the first quad
 (describing the pointers) is almost never fully used.
 * The last entry (closure type + ?), here `4294967326`, also seems quite
 repetitive given the size reserved.
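
 As a rough aid to reading the example above, the two quads can be split
 into bit fields. This is only a sketch under assumptions that the ticket
 does not state: that on a 64-bit target the layout word keeps the frame
 size in its low 6 bits and the pointer bitmap in the bits above (as the
 RTS `BITMAP_SIZE`/`BITMAP_BITS` macros do), and that the last quad is best
 read as two 32-bit halves. The function names are made up for
 illustration.

```python
# Sketch only: the field layout is an assumption (64-bit target),
# not something stated in the ticket.

def split_layout(word):
    """Split a small-bitmap layout word into (size, bitmap)."""
    size = word & 0x3F   # assumed: low 6 bits hold the frame size in words
    bitmap = word >> 6   # assumed: remaining bits are the pointer bitmap
    return size, bitmap

def split_halves(quad):
    """Split a 64-bit quad into its (low, high) 32-bit halves."""
    return quad & 0xFFFFFFFF, quad >> 32

size, bitmap = split_layout(6151)
low, high = split_halves(4294967326)
print(size, bin(bitmap))  # 7 0b1100000
print(low, high)          # 30 1
```

 The value 6151 fits in 13 bits, and `4294967326` is 30 in the low half
 with 1 in the high half. Which half carries the closure type is left open
 here, as it is in the ticket.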

 So I looked in detail at spectral/simple:
 * There are 2012 info tables of this sort, all of them with a zero in the
 second long.
 * We also reserve 8 bytes for the stack layout. However, only a single one
 of these tables requires more than 4 bytes.

 The compiled module has a size of 276384 bytes, with 16092 being
 redundant:
 * 4 bytes for the zero
 * 4 bytes of unused stack description
 * times 2012 info tables (minus the one table that does need all 8 bytes
 for its stack layout).

 That is an overhead of 5.8%, which seems like quite a lot to me.
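
 For anyone checking the numbers, the arithmetic behind the 16092 bytes can
 be spelled out as below. All figures come from this ticket; the variable
 names are only for illustration.

```python
# Figures from the ticket (spectral/simple).
tables = 2012          # info tables of this form
module_bytes = 276384  # size of the compiled module
needs_full_layout = 1  # tables whose stack layout needs more than 4 bytes

zero_long = 4 * tables                            # always-zero second .long
unused_layout = 4 * (tables - needs_full_layout)  # unused half of the layout quad
redundant = zero_long + unused_layout

print(redundant)                                  # 16092
print(round(100 * redundant / module_bytes, 2))   # 5.82
```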

 Where to put that information instead is a separate question. But looking
 only at the data, and not at how it is used, tagging the pointer to the SRT
 table seems like a possibility.
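
 To make the tagging idea concrete, here is a purely illustrative sketch.
 It assumes the SRT offset is always a multiple of 8, so that its low 3
 bits are free to carry a small value; the ticket does not confirm that
 alignment or propose a concrete encoding, and `pack`/`unpack` are
 hypothetical names.

```python
# Illustration only: assumes SRT offsets are 8-byte aligned, which the
# ticket does not confirm.
TAG_BITS = 3
TAG_MASK = (1 << TAG_BITS) - 1

def pack(offset, tag):
    """Store a small tag in the low bits of an aligned offset."""
    if offset & TAG_MASK:
        raise ValueError("offset is not 8-byte aligned")
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag does not fit in the free bits")
    return offset | tag

def unpack(word):
    """Recover (offset, tag) from a packed word."""
    return word & ~TAG_MASK, word & TAG_MASK

packed = pack(296, 5)
print(packed, unpack(packed))  # 301 (296, 5)
```

 Three bits would not hold a whole stack layout, so this only shows the
 mechanism, not a finished encoding.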

 The info table description `4294967326` also appeared over 1k times. Maybe
 it's possible to come up with a more efficient encoding there as well.
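
 A count like this can be reproduced with a few lines of scripting over the
 generated assembly. The sketch below is not the method used for this
 ticket; `count_quads` is a hypothetical helper, and it only recognizes
 `.quad` directives with a plain decimal operand.

```python
import re
from collections import Counter

# Matches lines such as "        .quad   4294967326" (decimal operands only).
QUAD_RE = re.compile(r"^[ \t]*\.quad[ \t]+(\d+)[ \t]*$", re.MULTILINE)

def count_quads(asm_text):
    """Count how often each decimal .quad value occurs in assembly text."""
    return Counter(int(value) for value in QUAD_RE.findall(asm_text))

sample = """
.align 8
        .long   SDjR_srt-(block_cHmk_info)+296
        .long   0
        .quad   6151
        .quad   4294967326
"""

for value, n in count_quads(sample).most_common():
    print(value, n)
```

 Run over a whole module's assembly output, the most common values show
 which descriptions would benefit most from a shorter encoding.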

 I haven't given it much thought yet, since I don't have the time to do
 anything about it in the near future.
 But I'm putting it here in case anyone is interested or looks into this in
 the future.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15126>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

