[GHC] #5218: Add unpackCStringLen# to create Strings from string literals

GHC ghc-devs at haskell.org
Sat Aug 19 19:16:24 UTC 2017


#5218: Add unpackCStringLen# to create Strings from string literals
-------------------------------------+-------------------------------------
        Reporter:  tibbe             |                Owner:  thoughtpolice
            Type:  feature request   |               Status:  patch
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.0.3
      Resolution:                    |             Keywords:  strings
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #5877 #10064      |  Differential Rev(s):  Phab:D2443
  #11312, #9719                      |
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by bgamari):

 > I believe that one goal is
 >
 > >   The ability to put a block of binary data in the program code,
 without heavy encoding.
 >
 > Is that a goal? Can we focus solely on that for a while?

 That is indeed a goal. Another goal, equally important in my mind, is to
 support constant-time length queries for the literals we write today.
 This is important for efficiently implementing things like the
 `bytestring` `Builder` in terms of `memcpy`.
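
 As a rough illustration of why the length matters, the sketch below
 contrasts copying a bare NUL-terminated `Addr#` literal, which needs a
 scan to find its end, with copying a literal that carries its length.
 `LenLit`, `helloAddr`, `helloLen`, `scanLength`, `copyScanned` and
 `copyKnown` are hypothetical names invented for this example, and the
 length is written by hand here rather than emitted by the compiler.
 `copyBytes` is the `memcpy` wrapper from `base`.

```haskell
{-# LANGUAGE MagicHash #-}

import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Marshal.Utils (copyBytes)
import Foreign.Storable (peekByteOff)
import GHC.Exts (Ptr (..))

-- Hypothetical stand-in for a literal that carries its own length.
data LenLit = LenLit (Ptr Word8) Int

-- A bare, NUL-terminated Addr# literal wrapped in a Ptr.
helloAddr :: Ptr Word8
helloAddr = Ptr "hello"#

-- Assumption: the length is supplied by hand for this sketch.
helloLen :: LenLit
helloLen = LenLit helloAddr 5

-- O(n): walk the bytes until the terminating NUL.
scanLength :: Ptr Word8 -> IO Int
scanLength p = go 0
  where
    go :: Int -> IO Int
    go n = do
      b <- peekByteOff p n :: IO Word8
      if b == 0 then pure n else go (n + 1)

-- Copying a bare literal: a scan, then the memcpy.
copyScanned :: Ptr Word8 -> Ptr Word8 -> IO Int
copyScanned dst src = do
  n <- scanLength src
  copyBytes dst src n
  pure n

-- Copying a length-carrying literal: just the memcpy.
copyKnown :: Ptr Word8 -> LenLit -> IO Int
copyKnown dst (LenLit src n) = copyBytes dst src n >> pure n

main :: IO ()
main = allocaBytes 16 $ \buf -> do
  n1 <- copyScanned buf helloAddr
  bs1 <- peekArray n1 buf
  n2 <- copyKnown buf helloLen
  bs2 <- peekArray n2 buf
  -- Both paths should copy the same bytes; only the cost differs.
  print (bs1 == bs2, n1, n2)
```

 Both paths produce the same bytes; the difference is that `copyScanned`
 reads every byte of the literal twice, once to measure it and once to
 copy it.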

 > Or `B#` could be `ByteArray#`. That would have the major advantage of
 not adding a new type, and for sure we'd need to be able to turn it into a
 `ByteArray#`. So I like that, and it's what jscholl suggests in
 comment:74.

 One issue with this is that we would gain a word of overhead for each
 string literal (assuming we use `B#` to implement Haskell's `String`). I
 took some measurements a while back and found that short strings are quite
 common, so this may be a hard cost to accept.

 > I don't have clarity on how bytestring would want to convert a
 `ByteArray#` to a `ByteString`. That ought to be a constant time
 operation.

 I may be missing something, but I don't believe this should pose any
 particular trouble. After all, a `ByteString` is essentially a `ForeignPtr`
 plus some length/offset information. `byteArrayContents#` gives us the
 `Addr#` of a (pinned) `ByteArray#`, from which we can construct a
 `Ptr` and in turn a `ForeignPtr`. Since we are talking about static data,
 `byteArrayContents#` should be safe.
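
 A minimal sketch of the `Addr#` to `Ptr` to `ForeignPtr` chain follows.
 Since there is no `ByteArray#` literal syntax to demonstrate with, it
 substitutes an `Addr#` string literal as the static data; with a pinned
 `ByteArray#`, the `Addr#` would instead come from `byteArrayContents#`.
 `Bytes`, `fromStatic` and `bytesToList` are hypothetical names that only
 mirror the pointer/offset/length layout described above; they are not
 `bytestring`'s actual definitions.

```haskell
{-# LANGUAGE MagicHash #-}

import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr_, withForeignPtr)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (plusPtr)
import GHC.Exts (Addr#, Ptr (..))

-- Hypothetical mirror of the layout: pointer, offset, length.
data Bytes = Bytes (ForeignPtr Word8) Int Int

-- O(1): wrap static data without copying. No finalizer is attached,
-- which is only sound because static data is never freed or moved.
fromStatic :: Addr# -> Int -> IO Bytes
fromStatic addr len = do
  fp <- newForeignPtr_ (Ptr addr)
  pure (Bytes fp 0 len)

-- Read the bytes back out, honoring offset and length.
bytesToList :: Bytes -> IO [Word8]
bytesToList (Bytes fp off len) =
  withForeignPtr fp $ \p -> peekArray len (p `plusPtr` off)

main :: IO ()
main = do
  b <- fromStatic "abc"# 3
  bytesToList b >>= print  -- prints [97,98,99]
```

 The conversion itself does no work proportional to the data size, which
 is the constant-time property asked about in the quoted question.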

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/5218#comment:91>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

