[GHC] #5218: Add unpackCStringLen# to create Strings from string literals
GHC
ghc-devs at haskell.org
Tue Apr 11 17:29:50 UTC 2017
#5218: Add unpackCStringLen# to create Strings from string literals
-------------------------------------+-------------------------------------
Reporter: tibbe | Owner: thoughtpolice
Type: feature request | Status: patch
Priority: normal | Milestone:
Component: Compiler | Version: 7.0.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: #5877 #10064 | Differential Rev(s): Phab:D2443
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by winter):
It looks like we have to make a space-time trade-off here: if `ByteArray#`'s
overhead is too large, adding an `Int#` will also cost a lot. If GHC is
smart enough to float all string constants out, then copying once at
runtime is acceptable. Otherwise I would still want to trade space for
time.
Here I also propose another solution: we keep the primitive literal's
type as `Addr#`, but encode the literal's byte length with UTF-8 rules,
that is, one byte for lengths up to 0x7F, two bytes for lengths up to
0x7FF, and so on. These UTF-8 encoded length header bytes are then put in
front of the actual content bytes.
Now all that is left to do is add a new unpack function
`unpackGHCString#` that decodes these header bytes first (we can reuse the
UTF-8 code!), then uses `memcpy`, or whatever else you want, with the
length info.
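
A rough sketch of the header scheme, to make the byte layout concrete. It
works on plain `[Word8]` lists instead of `Addr#`, and the names
`encodeLen`, `decodeLen`, `packWithHeader` and `unpackWithHeader` are made
up for illustration; none of them exist in GHC. It only borrows the UTF-8
bit layout (it does not reject the surrogate range) and, as an assumption of
this sketch, stops at four header bytes.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word8)

-- Encode a byte length using the UTF-8 variable-width bit layout.
-- Hypothetical helper: this sketch supports lengths up to 0x1FFFFF.
encodeLen :: Int -> [Word8]
encodeLen n
  | n < 0         = error "encodeLen: negative length"
  | n <= 0x7F     = [fromIntegral n]
  | n <= 0x7FF    = [0xC0 .|. hi 6, cont 0]
  | n <= 0xFFFF   = [0xE0 .|. hi 12, cont 6, cont 0]
  | n <= 0x1FFFFF = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  | otherwise     = error "encodeLen: length too large for this sketch"
  where
    -- Payload bits of the leading byte.
    hi :: Int -> Word8
    hi s = fromIntegral (n `shiftR` s)
    -- Continuation byte: 10xxxxxx carrying six payload bits.
    cont :: Int -> Word8
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Decode the length header; returns the length and the bytes after it.
decodeLen :: [Word8] -> Maybe (Int, [Word8])
decodeLen [] = Nothing
decodeLen (b : rest)
  | b < 0x80           = Just (fromIntegral b, rest)
  | b .&. 0xE0 == 0xC0 = go 1 (fromIntegral (b .&. 0x1F)) rest
  | b .&. 0xF0 == 0xE0 = go 2 (fromIntegral (b .&. 0x0F)) rest
  | b .&. 0xF8 == 0xF0 = go 3 (fromIntegral (b .&. 0x07)) rest
  | otherwise          = Nothing
  where
    -- Consume k continuation bytes, shifting six bits in per byte.
    go :: Int -> Int -> [Word8] -> Maybe (Int, [Word8])
    go 0 acc bs = Just (acc, bs)
    go k acc (c : cs)
      | c .&. 0xC0 == 0x80 =
          go (k - 1) ((acc `shiftL` 6) .|. fromIntegral (c .&. 0x3F)) cs
    go _ _ _ = Nothing

-- Prefix the payload with its encoded length.
packWithHeader :: [Word8] -> [Word8]
packWithHeader payload = encodeLen (length payload) ++ payload

-- Read the header, then take exactly that many payload bytes.
-- A real unpack function would `memcpy` from the address instead.
unpackWithHeader :: [Word8] -> Maybe [Word8]
unpackWithHeader bytes = do
  (len, body) <- decodeLen bytes
  if length body >= len then Just (take len body) else Nothing
```

Loaded in GHCi, `unpackWithHeader (packWithHeader xs)` should give back
`Just xs` for any `xs` within the supported length. The header costs one
byte for literals up to 127 bytes long, which is the space argument for this
layout over a fixed-width `Int#`.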
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/5218#comment:70>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler