[GHC] #5218: Add unpackCStringLen# to create Strings from string literals

GHC ghc-devs at haskell.org
Fri Apr 7 01:20:05 UTC 2017


#5218: Add unpackCStringLen# to create Strings from string literals
-------------------------------------+-------------------------------------
        Reporter:  tibbe             |                Owner:  thoughtpolice
            Type:  feature request   |               Status:  patch
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.0.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #5877 #10064      |  Differential Rev(s):  Phab:D2443
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by bgamari):

 Replying to [comment:60 winter]:
 > > What is stopping these libraries from providing this mechanism
 currently using Addr# and primitive strings directly?
 >
 > The problem is that there's no way to cast `Addr#` into `ByteArray#`
 without copying, while unboxed vector (not storable) and text both want
 `ByteArray#`.
 >
 Fair enough, but why not just poke a hole in the `ByteArray#` abstraction
 in that case? Namely, provide an `unsafeMkByteArray# :: Addr# -> Int ->
 ByteArray#`.
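
 For reference, the copy being discussed can be sketched with primops
 that exist today. This is only an illustration: `Bytes`, `copyLiteral`
 and `bytesToString` are names made up for the sketch, and the proposed
 `unsafeMkByteArray#` is not implemented here, since it does not exist
 in GHC.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.ST (ST (..), runST)

-- Boxed wrapper (a name made up for this sketch) so that the unlifted
-- ByteArray# can be passed around as an ordinary value.
data Bytes = Bytes ByteArray#

-- Copy the first n bytes at addr into a freshly allocated ByteArray#.
-- This is the copy that an Addr#-backed literal currently has to pay.
copyLiteral :: Addr# -> Int -> Bytes
copyLiteral addr (I# n) = runST (ST (\s0 ->
  case newByteArray# n s0 of
    (# s1, mba #) ->
      case copyAddrToByteArray# addr mba 0# n s1 of
        s2 -> case unsafeFreezeByteArray# mba s2 of
          (# s3, ba #) -> (# s3, Bytes ba #)))

-- Read the bytes back as 8-bit characters, for inspection only.
bytesToString :: Bytes -> String
bytesToString (Bytes ba) =
  map (\(I# i) -> C# (indexCharArray# ba i)) [0 .. I# (sizeofByteArray# ba) - 1]

main :: IO ()
main = putStrLn (bytesToString (copyLiteral "hello"# 5))
```

 The length is passed explicitly because a bare `Addr#` carries none,
 which is also why the proposed primitive takes an `Int`.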

 > > In general primitive strings are, as the name would suggest,
 primitive. I'm not sure forcing a heap object representation here is
 either necessary or prudent.
 >
 > I disagree. If we give string literals a proper compact representation,
 not only can we avoid unnecessary copying at runtime, we can also reduce
 code size in other ways.
 >
 > Consider: if string literals are now `ByteArray#`s, we can use rules to
 simplify a UTF8 text type, like `forall a. fromString (GHC.unpackCString#
 a) = UTF8 a`, which means we can use the constructor directly instead of
 several calls. The same applies to unboxed vectors using unboxed string
 literals and hexadecimal notation.

 I agree that these are all great simplifications, but I don't see why
 changing the representation of primitive strings is necessary to get
 there.
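
 As a rough sketch of that point, a rule of the quoted shape can already
 be written against the current `Addr#` representation. Everything named
 here (`Lit`, `fromStringLit`, `toString`, `greeting`) is hypothetical
 and not taken from any library, and whether the rule fires depends on
 optimisation settings and inlining order.

```haskell
{-# LANGUAGE MagicHash, OverloadedStrings #-}

import Data.String (IsString (..))
import GHC.Exts (Addr#, unpackCString#)

-- Hypothetical literal type: either the raw address of a literal or an
-- ordinary String built at runtime.
data Lit = FromAddr Addr# | FromList String

-- Generic path. Inlining is delayed so that the rule below has a chance
-- to match first.
fromStringLit :: String -> Lit
fromStringLit = FromList
{-# NOINLINE [1] fromStringLit #-}

-- With optimisation on, a literal can skip the intermediate String and
-- keep the address of the primitive string instead.
{-# RULES
"Lit/literal" forall a. fromStringLit (unpackCString# a) = FromAddr a
  #-}

instance IsString Lit where
  fromString = fromStringLit

-- Both constructors decode to the same String, so the program behaves
-- the same whether or not the rule fired.
toString :: Lit -> String
toString (FromAddr a) = unpackCString# a
toString (FromList s) = s

greeting :: Lit
greeting = "hello"

main :: IO ()
main = putStrLn (toString greeting)
```

 The rule only changes which constructor is used, so the observable
 result is the same with or without `-O`.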

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/5218#comment:61>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

