[GHC] #5218: Add unpackCStringLen# to create Strings from string literals

GHC ghc-devs at haskell.org
Tue Apr 11 17:29:50 UTC 2017


#5218: Add unpackCStringLen# to create Strings from string literals
-------------------------------------+-------------------------------------
        Reporter:  tibbe             |                Owner:  thoughtpolice
            Type:  feature request   |               Status:  patch
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.0.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #5877 #10064      |  Differential Rev(s):  Phab:D2443
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by winter):

 It looks like we have to make a space-time trade-off here: if the overhead
 of `ByteArray#` is too large, adding an `Int#` will also cost a lot. If GHC
 is smart enough to float all string constants out, then copying once at
 runtime is acceptable. Otherwise I would still want to trade space for
 time.

 I would also like to propose another solution: we keep the primitive
 literal's type as `Addr#`, but encode the literal's byte length using UTF-8
 rules, that is, one byte for lengths up to 0x7F, two bytes for lengths up
 to 0x7FF, and so on. These UTF-8-encoded length header bytes are then
 placed in front of the actual content bytes.

 All that is left to do is to add a new unpack function,
 `unpackGHCString#`, which decodes these header bytes first (we can reuse
 the UTF-8 code!) and then uses `memcpy`, or whatever else you want to do
 with the length information.
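
 To illustrate the header format, here is a minimal sketch in plain Haskell.
 It works on lists of `Word8` rather than on `Addr#`, and the names
 `encodeLen`, `decodeLen`, `withHeader` and `readLiteral` are hypothetical,
 chosen only for this example; none of them is existing GHC API. It assumes
 the four-byte UTF-8 limit, so lengths above 0x10FFFF are rejected, and it
 does not exclude the surrogate range, since the value is a length and not a
 code point.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word8)

-- | Encode a byte length using the UTF-8 bit layout (1 to 4 bytes).
-- Returns Nothing for negative lengths or lengths above 0x10FFFF.
encodeLen :: Int -> Maybe [Word8]
encodeLen n
  | n < 0         = Nothing
  | n <= 0x7F     = Just [fromIntegral n]
  | n <= 0x7FF    = Just [0xC0 .|. hi 6, cont 0]
  | n <= 0xFFFF   = Just [0xE0 .|. hi 12, cont 6, cont 0]
  | n <= 0x10FFFF = Just [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  | otherwise     = Nothing
  where
    hi, cont :: Int -> Word8
    -- Leading payload bits; the range checks above keep them in the lead byte.
    hi s   = fromIntegral (n `shiftR` s)
    -- Continuation byte: 10xxxxxx carrying six payload bits.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- | Decode a length header; returns the length and the remaining bytes.
decodeLen :: [Word8] -> Maybe (Int, [Word8])
decodeLen [] = Nothing
decodeLen (b : bs)
  | b < 0x80           = Just (fromIntegral b, bs)
  | b .&. 0xE0 == 0xC0 = go 1 (fromIntegral (b .&. 0x1F)) bs
  | b .&. 0xF0 == 0xE0 = go 2 (fromIntegral (b .&. 0x0F)) bs
  | b .&. 0xF8 == 0xF0 = go 3 (fromIntegral (b .&. 0x07)) bs
  | otherwise          = Nothing
  where
    -- Consume k continuation bytes, accumulating six bits from each.
    go :: Int -> Int -> [Word8] -> Maybe (Int, [Word8])
    go 0 acc rest = Just (acc, rest)
    go k acc (c : rest)
      | c .&. 0xC0 == 0x80 =
          go (k - 1) ((acc `shiftL` 6) .|. fromIntegral (c .&. 0x3F)) rest
    go _ _ _ = Nothing

-- | Prefix a payload with its encoded length, as the compiler would
-- when emitting the literal.
withHeader :: [Word8] -> Maybe [Word8]
withHeader payload = (++ payload) <$> encodeLen (length payload)

-- | Read the header, then take exactly that many content bytes.
readLiteral :: [Word8] -> Maybe [Word8]
readLiteral bytes = do
  (n, rest) <- decodeLen bytes
  if length rest >= n then Just (take n rest) else Nothing
```

 Under this layout a 5-byte literal carries the single header byte 0x05,
 while a 200-byte literal carries the two header bytes 0xC3 0x88. Once the
 length is known up front, a real `unpackGHCString#` could copy the content
 in one step instead of scanning for a terminating NUL.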

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/5218#comment:70>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

