[GHC] #15525: Unicode 8.0 and later characters are invariably lexical errors

Wed Aug 15 19:27:37 UTC 2018

#15525: Unicode 8.0 and later characters are invariably lexical errors
----------------------------------------+-------------------------------------
           Reporter:  ChaiTRex          |             Owner:  (none)
               Type:  bug               |            Status:  new
           Priority:  normal            |         Milestone:  8.6.1
          Component:  Compiler (Parser) |           Version:  8.4.3
           Keywords:                    |  Operating System:  Unknown/Multiple
       Architecture:  Unknown/Multiple  |   Type of failure:  None/Unknown
          Test Case:                    |        Blocked By:
           Blocking:                    |   Related Tickets:
Differential Rev(s):                    |         Wiki Page:
----------------------------------------+-------------------------------------
 I've tried several alphabetic characters and emoji that were added in
 various Unicode versions. Characters introduced up to Unicode 7.0 seem to
 work fine, while characters introduced in Unicode 8.0 and later seem to be
 rejected as lexical errors.

 For example, with the Unicode 10.0 [https://emojipedia.org/t-rex/ T. rex
 emoji], there are three lexical errors below:

 {{{#!hs
 module NoTRex where

 tRex :: String
 tRex = "🦖"

 🦖 :: String
 🦖 = "🦖"
 }}}

 produces:

 {{{
 [1 of 1] Compiling NoTRex           ( NoTRex.hs, NoTRex.o )

 NoTRex.hs:4:9: error:
     lexical error in string/character literal at character '\129430'
   |
 4 | tRex = "🦖"
   |         ^
 }}}

 If the `tRex` definition is removed, the identifier `🦖` is reported as a
 lexical error as well.
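
 A possible workaround for the string literal only, sketched under the
 assumption that the lexer accepts numeric escapes for code points it rejects
 when written raw (not verified against 8.4.3 here). The decimal value 129430
 is the one quoted in the error message; the `Main` module and its `main` are
 illustrative additions. It does not help with `🦖` used as a name.

```haskell
module Main where

import Data.Char (ord)

-- The T. rex emoji written as a decimal escape. 129430 is the code point
-- quoted in the error message, so no raw emoji appears in the source file.
tRex :: String
tRex = "\129430"

main :: IO ()
main =
  -- Print the code points rather than the string itself, so the output
  -- does not depend on the terminal's encoding.
  print (map ord tRex)
```

 Running it should print `[129430]`.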

 Also, pasting the fourth line into GHCi inserts only the characters before
 the first `🦖`, as if the `🦖` and everything after it had not been pasted.
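
 To help narrow this down, here is a small probe that reports how the
 `Data.Char` tables of a given installation classify the character from the
 error message. That the lexer's behaviour follows the same classification is
 an assumption, not something this ticket establishes. The names
 `tRexCodePoint` and `showCodePoint` are made up for this sketch.

```haskell
module Main where

import Data.Char (chr, generalCategory, toUpper)
import Numeric (showHex)

-- Code point quoted in the error message (decimal 129430).
tRexCodePoint :: Int
tRexCodePoint = 129430

-- Render a code point in U+ notation. There is no zero padding, which is
-- fine for code points above U+0FFF such as this one.
showCodePoint :: Int -> String
showCodePoint n = "U+" ++ map toUpper (showHex n "")

main :: IO ()
main = do
  putStrLn (showCodePoint tRexCodePoint)
  -- The category printed here depends on the Unicode version of the
  -- tables shipped with the installed base library.
  print (generalCategory (chr tRexCodePoint))
```

 The first line printed should be `U+1F996`. If the second line is
 `NotAssigned`, that would suggest the tables predate the Unicode version
 that added the character.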

 ----

 System information:

 {{{
 $ ghc --version
 The Glorious Glasgow Haskell Compilation System, version 8.4.3

 $ lsb_release -ds
 Ubuntu 16.04.5 LTS
 }}}

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15525>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler