[GHC] #5218: Add unpackCStringLen# to create Strings from string literals
GHC
ghc-devs at haskell.org
Wed Aug 16 05:12:56 UTC 2017
#5218: Add unpackCStringLen# to create Strings from string literals
-------------------------------------+-------------------------------------
Reporter: tibbe | Owner: thoughtpolice
Type: feature request | Status: patch
Priority: normal | Milestone:
Component: Compiler | Version: 7.0.3
Resolution: | Keywords: strings
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: Runtime performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: #5877, #10064, #11312, #9719 | Differential Rev(s): Phab:D2443
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by winter):
After thinking about this for a while, I think a better solution to the
literal problem is to overhaul the `OverloadedStrings` extension, because
the problem is created by it: without it we can't even use rewrite rules
to get the `Addr#` at all.
I think I will eventually write a GHC proposal, but let me sketch my idea
a little:
1. Currently, when `OverloadedStrings` is enabled, we treat a string
literal as polymorphic by translating it to `fromString ...`, where
`fromString` is a method of the `IsString` type class.
2. This makes desugaring the literal into a `String` the first step of
literal compilation, and at this very step we choose
`unpackCString# addr#` to desugar the literal.
3. The problem is that this fixed desugaring scheme is not flexible
enough to give rise to a `ByteArray#`-based representation, no matter
what rewrite rules are applied afterwards.
4. So I propose to solve the problem directly at the level of this
language extension: besides `IsString`, I propose to add the following
type classes:
{{{
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Addr#, ByteArray#)

class IsPlainAddr a where
    fromPlainAddr :: Addr# -> a

class IsAsciiByteArray a where
    fromAsciiByteArray :: ByteArray# -> a

class IsU8ByteArray a where
    fromU8ByteArray :: ByteArray# -> a

-- maybe someone wants UTF-16 desugaring? We can add that later.
}}}
5. Together with `IsString`, these type classes are treated specially when
`OverloadedStrings` is enabled: we try to find an instance for the type
being overloaded. Given `"Foo" :: Foo`, we look for an instance for `Foo`
among these classes; the priority among the classes can be arbitrary, as
long as we document it clearly.
6. Once an instance is found, we desugar according to the instance's
spec and directly inject `fromXXX "xxx"#` into the code. If a code point
in the source literal can't be encoded under that spec, we issue a
compile-time warning.
7. If we fail to find any such instance, we issue a compile error.
8. A library author can now implement whichever type class suits
his/her needs.
9. This solution can also be extended to handle `OverloadedLists`, e.g.
we can add the following type classes for desugaring list literals:
{{{
class IsIntList a where
    fromIntList :: ByteArray# -> a

class IsWordList a where
    fromWordList :: ByteArray# -> a

class IsInt8List a where
    fromInt8List :: ByteArray# -> a
...
}}}
10. When `OverloadedLists` is enabled, we try to find an instance of
these special classes and transform the list into a `ByteArray#` according
to the instance's spec; if any element overflows, we issue a warning.
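As a baseline for points 9 and 10, here is a rough sketch of how `OverloadedLists` behaves today, using the existing `IsList` class from `GHC.Exts`. The `IntBag` type is made up for illustration, and the `desugared` binding is only an approximation of what GHC generates. Nothing here implements the proposed `ByteArray#`-based classes; it only shows the list-based step they would replace.

```haskell
{-# LANGUAGE OverloadedLists #-}
{-# LANGUAGE TypeFamilies #-}

import GHC.Exts (IsList (..))

-- Hypothetical container type, used only for this illustration.
newtype IntBag = IntBag [Int]
  deriving (Eq, Show)

instance IsList IntBag where
  type Item IntBag = Int
  fromList = IntBag
  toList (IntBag xs) = xs

-- What the user writes.
sugared :: IntBag
sugared = [1, 2, 3]

-- Roughly what it means today: an ordinary list is built first and
-- then handed to fromListN together with its length.
desugared :: IntBag
desugared = fromListN 3 elems
  where
    elems :: [Int]
    elems = 1 : 2 : 3 : []
```

Loaded in GHCi, `sugared == desugared` should hold. The intermediate `[Int]` is what the proposal would like to replace with a packed `ByteArray#`.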
If people later ask for a new literal desugaring format, we add new
type classes and we are done: old code continues to work, and new code
will get a compile error on old compilers.
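To make the string part concrete, here is a minimal sketch that contrasts today's desugaring with the proposed one. The first half uses only what exists now (`IsString`, `unpackCString#`). The second half defines `IsPlainAddr` by hand, since no such class exists in GHC, and writes out manually the `fromPlainAddr "Foo"#` call that the compiler would inject under the proposal. The `Name` and `RawLit` types and the `rawLitToString` helper are assumptions made up for this example.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE OverloadedStrings #-}

import Foreign.C.String (CString, peekCString)
import GHC.Exts (Addr#, IsString (..), Ptr (Ptr), unpackCString#)

-- Today: the literal always goes through String first.
newtype Name = Name String
  deriving (Eq, Show)

instance IsString Name where
  fromString = Name

-- What the user writes.
sugaredName :: Name
sugaredName = "Foo"

-- Roughly what GHC turns it into.
desugaredName :: Name
desugaredName = fromString (unpackCString# "Foo"#)

-- Proposed (hypothetical): the instance receives the raw address.
class IsPlainAddr a where
  fromPlainAddr :: Addr# -> a

-- Hypothetical type that keeps a pointer to the NUL-terminated literal.
newtype RawLit = RawLit CString

instance IsPlainAddr RawLit where
  fromPlainAddr addr = RawLit (Ptr addr)

-- Written by hand here; under the proposal the compiler would
-- generate this from a plain "Foo" literal at type RawLit.
rawFoo :: RawLit
rawFoo = fromPlainAddr "Foo"#

-- Helper to inspect the literal; decodes using the current locale.
rawLitToString :: RawLit -> IO String
rawLitToString (RawLit p) = peekCString p
```

In the second half no intermediate `String` is built at all, which is the point of moving the choice of representation into the extension itself.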
BTW, I think this is what we call "解铃还需系铃人" in Chinese (roughly: whoever tied the bell must be the one to untie it) ; )
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/5218#comment:82>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler