[GHC] #5218: Add unpackCStringLen# to create Strings from string literals

GHC ghc-devs at haskell.org
Wed Aug 16 05:12:56 UTC 2017


#5218: Add unpackCStringLen# to create Strings from string literals
-------------------------------------+-------------------------------------
        Reporter:  tibbe             |                Owner:  thoughtpolice
            Type:  feature request   |               Status:  patch
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.0.3
      Resolution:                    |             Keywords:  strings
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #5877 #10064      |  Differential Rev(s):  Phab:D2443
  #11312, #9719                      |
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by winter):

 After thinking about this for a while, I think a better solution to the
 literal problem is to overhaul the `OverloadedStrings` extension, because
 the problem is created by it: without it we can't even use rewrite rules
 to get at the `Addr#` at all.

 I think I will eventually write a GHC proposal, but let me sketch the idea
 first:

 1. Currently, when `OverloadedStrings` is enabled, a string literal is
 treated as polymorphic by translating it to `fromString ...`, where
 `fromString` is the method of the `IsString` type class.

 2. This makes desugaring the literal into a `String` the first step of
 literal compilation, and at this very step we choose `unpackCString#
 addr#` to desugar the literal.

 3. The problem is that this desugaring scheme is fixed: it is not
 flexible enough to give rise to a `ByteArray#`-based representation, no
 matter what rewrite rules are applied afterwards.

 4. So I propose to solve the problem directly at the language extension
 level: besides `IsString`, I propose adding the following type classes:


 {{{
 class IsPlainAddr a where
     fromPlainAddr :: Addr# -> a

 class IsAsciiByteArray a where
     fromAsciiByteArray :: ByteArray# -> a

 class IsU8ByteArray a where
     fromU8ByteArray :: ByteArray# -> a

 -- maybe someone wants UTF-16 desugaring? we can add it later
 }}}
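
 To make the difference concrete, here is a minimal sketch, assuming a
 hypothetical `Lit` type and a hypothetical helper `litLength` (neither
 exists in any library). `today` is what the current scheme elaborates
 `"Foo"` to; `proposed` is written by hand, since no compiler implements
 this class:

```haskell
{-# LANGUAGE MagicHash #-}

import Data.String (IsString (..))
import GHC.CString (unpackCString#)
import GHC.Exts (Addr#, Int (I#), eqChar#, indexCharOffAddr#, isTrue#, (+#))

-- The class as written in the comment.
class IsPlainAddr a where
    fromPlainAddr :: Addr# -> a

-- Hypothetical literal type: keeps the raw address of the
-- NUL-terminated literal instead of going through String.
data Lit = Lit Addr#

instance IsPlainAddr Lit where
    fromPlainAddr = Lit

-- Hypothetical helper: count bytes up to the terminating NUL.
litLength :: Lit -> Int
litLength (Lit addr) = go 0#
  where
    go i
      | isTrue# (indexCharOffAddr# addr i `eqChar#` '\0'#) = I# i
      | otherwise = go (i +# 1#)

-- What OverloadedStrings elaborates "Foo" to today: a String is always
-- built first, then handed to fromString.
today :: String
today = fromString (unpackCString# "Foo"#)

-- What the scheme sketched in this comment would inject for "Foo" :: Lit
-- (hand-written here, not produced by any compiler).
proposed :: Lit
proposed = fromPlainAddr "Foo"#

main :: IO ()
main = do
    putStrLn today              -- prints Foo
    print (litLength proposed)  -- prints 3
```

 The point of the sketch is that `proposed` never materialises a `String`,
 so no rewrite rule is needed to recover the address.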

 5. Together with `IsString`, these type classes are special when
 `OverloadedStrings` is enabled: we try to find an instance for the type
 being overloaded. Given `"Foo" :: Foo`, we look for an instance for `Foo`
 among these classes. The priority among the classes can be arbitrary as
 long as we document it clearly.

 6. Once an instance is found, we desugar according to the instance spec
 and directly inject `fromXXX "xxx"#` into the code. If a code point in the
 source literal can't be encoded under the instance spec, we issue a
 compile-time warning.

 7. If we fail to find an instance among the above, we issue a compile error.

 8. Now a library author can choose to implement whichever type class suits
 his/her needs.

 9. This solution can also be extended to handle `OverloadedLists`, e.g.
 we can add the following type classes for desugaring list literals:

 {{{
 class IsIntList a where
     fromIntList :: ByteArray# -> a

 class IsWordList a where
     fromWordList :: ByteArray# -> a

 class IsInt8List a where
     fromInt8List :: ByteArray# -> a

 ...
 }}}
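
 As a rough illustration of how a library might consume such a class, the
 sketch below defines a hypothetical `IntVec` type with an `IsIntList`
 instance. Since no compiler packs list literals this way, `packInts` is a
 hand-written runtime stand-in for that step, and `unpackInts` only exists
 to inspect the result; all three names are assumptions:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import Foreign.Storable (sizeOf)
import GHC.Exts
    ( ByteArray#, Int (I#), Int#, MutableByteArray#, State#
    , indexIntArray#, isTrue#, newByteArray#, quotInt#, runRW#
    , sizeofByteArray#, unsafeFreezeByteArray#, writeIntArray#
    , (*#), (+#), (>=#) )

-- The class as written in the comment.
class IsIntList a where
    fromIntList :: ByteArray# -> a

-- Hypothetical unboxed vector of machine-sized Ints.
data IntVec = IntVec ByteArray#

instance IsIntList IntVec where
    fromIntList = IntVec

-- Size of one Int in bytes on this platform.
wordSize :: Int
wordSize = sizeOf (0 :: Int)

-- Write the list elements into consecutive Int slots.
fill :: MutableByteArray# s -> Int# -> [Int] -> State# s -> State# s
fill _   _ []            s = s
fill mba i (I# x : rest) s = fill mba (i +# 1#) rest (writeIntArray# mba i x s)

-- Runtime stand-in for the packing the compiler would perform.
packInts :: [Int] -> ByteArray#
packInts xs =
    case (length xs, wordSize) of
      (I# n, I# w) ->
        case runRW# (\s0 ->
               case newByteArray# (n *# w) s0 of
                 (# s1, mba #) ->
                   unsafeFreezeByteArray# mba (fill mba 0# xs s1)) of
          (# _, ba #) -> ba

-- Read the elements back, to check the round trip.
unpackInts :: IntVec -> [Int]
unpackInts (IntVec ba) =
    case wordSize of
      I# w -> go 0# (sizeofByteArray# ba `quotInt#` w)
  where
    go i n
      | isTrue# (i >=# n) = []
      | otherwise         = I# (indexIntArray# ba i) : go (i +# 1#) n

main :: IO ()
main = print (unpackInts (fromIntList (packInts [1, 2, 3])))  -- prints [1,2,3]
```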

 10. When `OverloadedLists` is enabled, we try to find an instance of
 these special classes and transform the list into a `ByteArray#` according
 to the instance spec; if any element overflows, we issue a warning.
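
 The checks in steps 6 and 10 are not spelled out above. One possible
 reading, as a sketch only: warn for each code point the target encoding
 cannot represent, and for each list element outside the element type's
 range. The helper names `nonAsciiChars` and `overflowsInt8` are made up
 for illustration, and a real implementation would live in the compiler
 rather than run at runtime:

```haskell
import Data.Char (isAscii)
import Data.Int (Int8)

-- Step 6, for an ASCII instance spec: the code points that cannot be
-- encoded and would therefore trigger a warning.
nonAsciiChars :: String -> [Char]
nonAsciiChars = filter (not . isAscii)

-- Step 10, for an Int8 instance spec: the literal elements that do not
-- fit and would therefore trigger an overflow warning.
overflowsInt8 :: [Integer] -> [Integer]
overflowsInt8 = filter outOfRange
  where
    outOfRange n = n < lo || n > hi
    lo = toInteger (minBound :: Int8)
    hi = toInteger (maxBound :: Int8)

main :: IO ()
main = do
    print (null (nonAsciiChars "Foo"))      -- prints True
    print (overflowsInt8 [0, 127, 128, -129])  -- prints [128,-129]
```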


 If people later ask for a new literal desugaring format, we add new
 type classes and are done: old code continues to work, and new code gets a
 compile error on old compilers.


 BTW, I think this is what we call "解铃还需系铃人" in Chinese ("whoever
 tied the bell must be the one to untie it") ; )

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/5218#comment:82>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

