[Git][ghc/ghc][master] Don't assume the current locale is *.UTF-8, set the encoding explicitly

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Fri Nov 10 00:17:01 UTC 2023



Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC


Commits:
52c0fc69 by PHO at 2023-11-09T19:16:22-05:00
Don't assume the current locale is *.UTF-8, set the encoding explicitly

primops.txt contains Unicode characters, so reading it fails under a non-UTF-8 locale:
> LC_ALL=C ./genprimopcode --data-decl < ./primops.txt
> genprimopcode: <stdin>: hGetContents: invalid argument (cannot decode byte sequence starting from 226)

Hadrian must also avoid using readFile' to read primops.txt because it
tries to decode the file with a locale-specific encoding.

- - - - -


2 changed files:

- hadrian/src/Builder.hs
- utils/genprimopcode/Main.hs


Changes:

=====================================
hadrian/src/Builder.hs
=====================================
@@ -333,8 +333,8 @@ instance H.Builder Builder where
                 GenApply -> captureStdout
 
                 GenPrimopCode -> do
-                    stdin <- readFile' input
-                    Stdout stdout <- cmd' (Stdin stdin) [path] buildArgs buildOptions
+                    need [input]
+                    Stdout stdout <- cmd' (FileStdin input) [path] buildArgs buildOptions
                     -- see Note [Capture stdout as a ByteString]
                     writeFileChangedBS output stdout
 


=====================================
utils/genprimopcode/Main.hs
=====================================
@@ -13,6 +13,7 @@ import Data.Char
 import Data.List (union, intersperse, intercalate, nub)
 import Data.Maybe ( catMaybes )
 import System.Environment ( getArgs )
+import System.IO ( hSetEncoding, stdin, stdout, utf8 )
 
 vecOptions :: Entry -> [(String,String,Int)]
 vecOptions i =
@@ -116,7 +117,9 @@ main = getArgs >>= \args ->
                    ++ unlines (map ("            "++) known_args)
                   )
        else
-       do s <- getContents
+       do hSetEncoding stdin  utf8 -- The input file is in UTF-8. Set the encoding explicitly.
+          hSetEncoding stdout utf8
+          s <- getContents
           case parse s of
              Left err -> error ("parse error at " ++ (show err))
              Right p_o_specs@(Info _ _)
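
For readers who want to try the technique outside GHC's tree, the following is a minimal sketch of locale-independent decoding, assuming only `base` and the `directory` package that ships with GHC. It applies `hSetEncoding` to a file handle rather than to `stdin`, and the helpers `writeFileUtf8`, `readFileUtf8`, and `roundTrip` are hypothetical names invented for this illustration; they are not part of the commit. The byte 226 in the error message above is 0xE2, the lead byte of a three-byte UTF-8 sequence such as the one encoding U+2192, which is why the sample string includes that character.

```haskell
import Control.Exception (evaluate)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- Hypothetical helper (not from the commit): write a String as UTF-8,
-- regardless of the encoding implied by the current locale.
writeFileUtf8 :: FilePath -> String -> IO ()
writeFileUtf8 path contents = withFile path WriteMode $ \h -> do
  hSetEncoding h utf8
  hPutStr h contents

-- Hypothetical helper (not from the commit): read a whole file as UTF-8,
-- regardless of the encoding implied by the current locale.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = withFile path ReadMode $ \h -> do
  hSetEncoding h utf8
  contents <- hGetContents h
  -- hGetContents is lazy: force the whole string before withFile closes h.
  _ <- evaluate (length contents)
  return contents

-- Write a sample to a temporary file and check that it decodes unchanged.
roundTrip :: String -> IO Bool
roundTrip sample = do
  tmpDir <- getTemporaryDirectory
  (path, h) <- openTempFile tmpDir "utf8-demo.txt"
  hClose h
  writeFileUtf8 path sample
  decoded <- readFileUtf8 path
  removeFile path
  return (decoded == sample)

main :: IO ()
main = do
  -- Same call the commit adds to genprimopcode, applied to stdout here.
  hSetEncoding stdout utf8
  -- \8594 is U+2192 (rightwards arrow), \955 is U+03BB (lambda).
  ok <- roundTrip "primops: \8594 \955"
  putStrLn (if ok then "round trip ok" else "round trip FAILED")
```

Running the program with `LC_ALL=C` set should behave the same as under a UTF-8 locale, since no handle that carries non-ASCII text is left on the locale default.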



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/52c0fc691e6501e99a96693ec1fc02e3c93a4fbc
