[GHC] #5218: Add unpackCStringLen# to create Strings from string literals
GHC
ghc-devs at haskell.org
Sat Aug 19 19:16:24 UTC 2017
#5218: Add unpackCStringLen# to create Strings from string literals
-------------------------------------+-------------------------------------
Reporter: tibbe | Owner: thoughtpolice
Type: feature request | Status: patch
Priority: normal | Milestone:
Component: Compiler | Version: 7.0.3
Resolution: | Keywords: strings
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: #5877 #10064 #11312, #9719 | Differential Rev(s): Phab:D2443
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by bgamari):
> I believe that one goal is
>
> > The ability to put a block of binary data in the program code,
> > without heavy encoding.
>
> Is that a goal? Can we focus solely on that for a while?
That is indeed a goal. Another goal, in my mind equally important, is to
support constant-time length for the literals we write today.
This is important for efficiently implementing things like
`bytestring`'s `Builder` in terms of `memcpy`.
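
To illustrate why a known length matters, here is a minimal sketch. The
`LitWithLen` type and the functions `scanLength` and `copyLit` are
hypothetical names made up for this example; they are not part of GHC or
`bytestring`. A NUL-terminated literal has to be scanned to find its
length, whereas a literal that carries its length can be copied with a
single `memcpy`-style call (`copyBytes`).

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Marshal.Utils (copyBytes)
import Foreign.Storable (peekByteOff)
import GHC.Ptr (Ptr (..))

-- Hypothetical representation: a literal's address plus its length,
-- assumed to be known statically by the compiler.
data LitWithLen = LitWithLen (Ptr Word8) Int

-- Today's situation: the length of a NUL-terminated literal is only
-- found by scanning for the terminator, which is O(n).
scanLength :: Ptr Word8 -> IO Int
scanLength p = go 0
  where
    go i = do
      b <- peekByteOff p i :: IO Word8
      if b == 0 then pure i else go (i + 1)

-- With the length at hand, copying is one bulk copy and no scan.
copyLit :: LitWithLen -> Ptr Word8 -> IO ()
copyLit (LitWithLen src n) dst = copyBytes dst src n

main :: IO ()
main = do
  let lit = LitWithLen (Ptr "hello"#) 5
  scanned <- scanLength (Ptr "hello"#)
  copied <- allocaBytes 5 $ \dst -> do
    copyLit lit dst
    peekArray 5 dst
  print (scanned, copied)
```

The sketch only contrasts the two access patterns; how the length would
actually be attached to a literal is exactly what this ticket is
discussing.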
> Or `B#` could be `ByteArray#`. That would have the major advantage of
> not adding a new type, and for sure we'd need to be able to turn it into a
> `ByteArray#`. So I like that, and it's what jscholl suggests in
> comment:74.
One issue with this is that we would gain a word of overhead for each
string literal (assuming we use `B#` to implement Haskell's `String`). I
took some measurements a while back and found that short strings are quite
common, so this may be a hard cost to accept.
> I don't have clarity on how bytestring would want to convert a
> `ByteArray#` to a `ByteString`. That ought to be a constant time
> operation.
I may be missing something, but I don't believe this should pose any
particular trouble. After all, a `ByteString` is essentially a `ForeignPtr`
and some length/offset information. `byteArrayContents#` allows us to get
the `Addr#` from a (pinned) `ByteArray#`, from which we can construct a
`Ptr` and in turn a `ForeignPtr`. Since we are talking about static data,
`byteArrayContents#` should be safe.
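
A rough sketch of that conversion follows. It makes some assumptions:
static `ByteArray#` literals do not exist, so the helper
`pinnedFromLiteral` (a hypothetical name) stands in for one by copying an
`Addr#` literal into a freshly allocated pinned array. The conversion
itself is only the last two lines, which take the address with
`byteArrayContents#` and wrap it in a `ForeignPtr` without copying. The
`PlainPtr` constructor from `GHC.ForeignPtr` is used here to keep the
heap array alive; truly static data would presumably not need that.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

import qualified Data.ByteString as B
import qualified Data.ByteString.Internal as BI
import GHC.Exts
import GHC.ForeignPtr (ForeignPtr (..), ForeignPtrContents (PlainPtr))
import GHC.IO (IO (..))

-- Hypothetical helper: build a pinned ByteArray# from a literal, then
-- view it as a ByteString in constant time.
pinnedFromLiteral :: Addr# -> Int -> IO B.ByteString
pinnedFromLiteral addr (I# n) = IO $ \s0 ->
  case newPinnedByteArray# n s0 of
    (# s1, mba #) ->
      -- Setup only: copy the literal's bytes into the pinned array.
      case copyAddrToByteArray# addr mba 0# n s1 of
        s2 -> case unsafeFreezeByteArray# mba s2 of
          (# s3, ba #) ->
            -- The conversion proper: no copy, just an address and a length.
            let fp = ForeignPtr (byteArrayContents# ba) (PlainPtr mba)
            in (# s3, BI.fromForeignPtr fp 0 (I# n) #)

main :: IO ()
main = do
  bs <- pinnedFromLiteral "static data"# 11
  print (B.length bs, bs)
```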
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/5218#comment:91>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler