[GHC] #5218: Add unpackCStringLen# to create Strings from string literals
GHC
ghc-devs at haskell.org
Fri Aug 18 15:56:59 UTC 2017
#5218: Add unpackCStringLen# to create Strings from string literals
-------------------------------------+-------------------------------------
Reporter: tibbe | Owner: thoughtpolice
Type: feature request | Status: patch
Priority: normal | Milestone:
Component: Compiler | Version: 7.0.3
Resolution: | Keywords: strings
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: #5877, #10064, #11312, #9719 | Differential Rev(s): Phab:D2443
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by simonpj):
I'm struggling to grok this ticket, especially: '''what is the problem we
are trying to solve?'''. I'm also concerned about making things too
complicated.
''jscholl in comment:74 sounds right on target to me''. Here's my
thinking, written out. Let's see if we agree at least about the "Goals"
and "Core" part.
== Goals ==
I believe that one goal is
* '''The ability to put a block of binary data in the program code,
without heavy encoding.'''
Is that a goal? Can we focus solely on that for a while?
== Core ==
To meet that goal, in Core, we need
* A primitive data type `B#` whose values are simply blobs of binary data.
* Some operations over this type; e.g. `lenB :: B# -> Int`, or
`unpackString :: B# -> [Char]` or whatever.
* Literal values (in Core) for `B#` values.
`B#` plays the role of the `(# Int#, Addr# #)` representation mentioned
above (comment:38 ff), but without being so concrete.
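To make the shape of that interface concrete, here is a minimal sketch that models the placeholder `B#` with an ordinary boxed type. Everything in it is an assumption for illustration only: the `Blob` newtype, the list-of-bytes representation, and the choice to decode each byte as one Latin-1 character in `unpackString` are not part of the proposal, which only names `lenB` and `unpackString` as possible operations.

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical boxed stand-in for the proposed primitive type B#.
-- A real B# would be an unlifted primitive, not a list of bytes.
newtype Blob = Blob [Word8]

-- The length is part of the value, so no terminator byte is needed.
lenB :: Blob -> Int
lenB (Blob bs) = length bs

-- Assumption: each byte is decoded as one character (Latin-1 style).
unpackString :: Blob -> [Char]
unpackString (Blob bs) = map (chr . fromIntegral) bs

-- A blob with an embedded zero byte, which is legal binary data.
example :: Blob
example = Blob [0x68, 0x69, 0x00, 0x21]
```

The point of the sketch is only that both operations work on the blob as a whole value, without the caller having to carry a length around separately.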
I'm only using "`B#`" as a placeholder; we need a proper name for it! So
what is it, precisely?
* `B#` could be a completely new primitive type.
* Or `B#` could be `ByteArray#`. That would have the major advantage of
not adding a new type, and for sure we'd need to be able to turn it into a
`ByteArray#`. So I like that, and it's what jscholl suggests in
comment:74.
* But `B#` can't be `Addr#` (which is a memory address)! Also look at
#11312, which is highly relevant because it has the same conclusion. In
#11312, I call this new type `String#`, but that's too character-oriented.
I think we should focus on binary data. But adopting `B#` would fix the
ghastly problems in #11312.
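The trouble with `Addr#` can be seen with the primitives GHC already has. The sketch below assumes a GHC whose `GHC.Exts` exports `unpackCString#` and `unpackNBytes#`; the names `nulTerminated` and `explicitLength` are made up for the example. A primitive string literal is just an address, so unpacking it either relies on a terminating NUL byte or needs the length supplied out of band.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (unpackCString#, unpackNBytes#)

-- "..."# has type Addr#: a bare pointer with no length attached.
-- unpackCString# scans for the first NUL byte, so the embedded
-- zero cuts the data short.
nulTerminated :: String
nulTerminated = unpackCString# "ab\0cd"#

-- Passing the length by hand recovers all five bytes, but nothing
-- checks that 5# really matches the literal.
explicitLength :: String
explicitLength = unpackNBytes# "ab\0cd"# 5#
```

Here `nulTerminated` stops after `"ab"`, while `explicitLength` contains all five characters. A type that carries its own size, as `ByteArray#` does, would avoid both the truncation and the hand-maintained length.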
== Haskell ==
If we had this new primitive type, we'd soon want literals for it in
Haskell source code.
* I suppose we could have a new literal syntax (about whose details I am
intensely relaxed). After all, a language's literals ought to be
expressible in its source syntax.
* But we could say you could only get it via a TH quasiquote e.g. `[bytes|
fec923ac |]`? Is that so terrible?
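As a rough illustration of the quasiquote option, the function below is the kind of hex decoding a hypothetical `bytes` quasiquoter might run at compile time on the text between the brackets. The name `parseHexBytes` and the accepted format (pairs of hex digits, whitespace ignored) are assumptions; no such quasiquoter is specified in this ticket, and wiring the function into a real `QuasiQuoter` would additionally need the `template-haskell` package.

```haskell
import Data.Char (digitToInt, isHexDigit, isSpace)
import Data.Word (Word8)

-- Decode a string of hex digit pairs into bytes.
-- Whitespace is ignored; an odd number of digits or a
-- non-hex character yields Nothing.
parseHexBytes :: String -> Maybe [Word8]
parseHexBytes = go . filter (not . isSpace)
  where
    go [] = Just []
    go (hi : lo : rest)
      | isHexDigit hi && isHexDigit lo =
          (fromIntegral (16 * digitToInt hi + digitToInt lo) :) <$> go rest
    go _ = Nothing
```

For the body of the example above, `parseHexBytes " fec923ac "` gives `Just [254,201,35,172]`. A quasiquoter built on it could report the `Nothing` case as a compile-time error.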
Note that everything in comment:84 belongs in this section. By the time
we get to Core all this typeclass stuff has gone away.
== Other goals ==
I don't have clarity on how `bytestring` would want to convert a
`ByteArray#` to a `ByteString`. That ought to be a constant time
operation.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/5218#comment:86>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler