[GHC] #15021: ghc-pkg list crashes on Windows when unicode character is in the path

GHC ghc-devs at haskell.org
Tue Apr 10 14:21:40 UTC 2018


#15021: ghc-pkg list crashes on Windows when unicode character is in the path
----------------------------------------+---------------------------------
           Reporter:  nh2               |             Owner:  (none)
               Type:  bug               |            Status:  new
           Priority:  normal            |         Milestone:  8.6.1
          Component:  ghc-pkg           |           Version:  8.2.2
           Keywords:                    |  Operating System:  Windows
       Architecture:  Unknown/Multiple  |   Type of failure:  None/Unknown
          Test Case:                    |        Blocked By:
           Blocking:                    |   Related Tickets:  #10762
Differential Rev(s):                    |         Wiki Page:
----------------------------------------+---------------------------------
 Below I will attach examples of how `ghc-pkg list` crashes with

 {{{
 <stdout>: commitBuffer: invalid argument (invalid character)
 }}}

 when the user name contains a non-ASCII character (in my case `日`). I
 don't think the user name itself matters; more likely what matters is
 whether `日` appears in some path (probably the path to the package
 database).

 The issue is trivially reproducible (I tried on Windows 10, and Server
 2016 as deployed by AWS) with cmd.exe and PowerShell.

 In the transcripts below, `chcp 20127` sets the code page to US-ASCII, and
 `chcp 65001` sets it to UTF-8.

 This per-terminal / per-process setting is different from the system
 locale (as can be set [https://www.java.com/en/download/help/locale.xml
 this way]).

 The behaviour of different combinations of system locale and code
 page:

 * English (US) system locale
   * `chcp 437` (English default): **crash**
   * `chcp 65001` (UTF-8): no crash
   * `chcp 20127` (ASCII): **crash**
 * Japanese system locale
   * `chcp 932` (Japanese default): no crash
   * `chcp 65001` (UTF-8): no crash
   * `chcp 20127` (ASCII): **crash**

 In none of these situations should `ghc-pkg list` crash.
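
 The pattern in the list above is consistent with stdout using an encoding
 derived from the active code page, one that cannot represent `日`. The
 following is only a sketch that mimics that failure mode on any platform,
 not a reproduction of the Windows code path: it assumes `latin1_checked`
 from `GHC.IO.Encoding.Latin1` as a stand-in for a narrow code page, and
 the sample path and the scratch file name `encoding-demo.txt` are
 hypothetical.

```haskell
import Control.Exception (IOException, try)
import GHC.IO.Encoding.Latin1 (latin1_checked)
import System.IO

-- Hypothetical path containing the character from the report.
samplePath :: FilePath
samplePath = "C:\\Users\\日\\package.conf.d"

-- Write a string to a scratch file through the given encoding and
-- report whether the write succeeded. The scratch file is left behind
-- in the current directory.
writeWith :: TextEncoding -> String -> IO (Either IOException ())
writeWith enc str =
  try $ withFile "encoding-demo.txt" WriteMode $ \h -> do
    hSetEncoding h enc
    hPutStrLn h str

-- Render a result without printing the (possibly non-ASCII) payload.
describe :: Either IOException () -> String
describe = either (const "failed") (const "ok")

main :: IO ()
main = do
  narrow <- writeWith latin1_checked samplePath -- cannot represent U+65E5
  wide   <- writeWith utf8 samplePath           -- can represent U+65E5
  putStrLn ("latin1_checked: " ++ describe narrow)
  putStrLn ("utf8:           " ++ describe wide)
```

 Under these assumptions the narrow encoding should report a failure and
 UTF-8 should succeed, which matches the split between the crashing and
 non-crashing code pages above.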

 It breaks all the build tools; from conversations on the `haskell-jp`
 Slack, I have learned that Japanese users avoid Unicode in their user
 names because of this issue.

 I believe the fix to `ghc-pkg` should be similar to
 [https://github.com/ghc/ghc/commit/1b56c40578374a15b4a2593895710c68b0e2a717
 this fix for `ghc`] (#10762), where the encoding of the stdout `Handle` is
 set to UTF-8.
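
 A minimal sketch of what such a change could look like, assuming it is
 enough to call `hSetEncoding` from `System.IO` on stdout before anything
 is printed. The helper name `forceUtf8` and the sample path are
 hypothetical, and this is not the actual `ghc-pkg` code:

```haskell
import System.IO (Handle, hGetEncoding, hSetEncoding, stdout, utf8)

-- Hypothetical helper: make a handle emit UTF-8 regardless of the
-- active code page or locale.
forceUtf8 :: Handle -> IO ()
forceUtf8 h = hSetEncoding h utf8

main :: IO ()
main = do
  forceUtf8 stdout
  -- Hypothetical package database path containing a non-ASCII character.
  putStrLn "C:\\Users\\日\\package.conf.d"
  -- Show which encoding stdout ended up with.
  enc <- hGetEncoding stdout
  putStrLn ("stdout encoding: " ++ maybe "binary" show enc)
```

 With this approach the bytes written are UTF-8 even when the terminal's
 code page is something else, so the path may be displayed incorrectly
 under, say, `chcp 437`, but the write itself should no longer fail.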

 ----

 With English (US) system locale, as copy-pasted from PowerShell:

 {{{
 Windows PowerShell
 Copyright (C) 2016 Microsoft Corporation. All rights reserved.

 PS C:\Users\日> chcp
 Active code page: 437
 PS C:\Users\日> C:\Users\日
 \AppData\Local\Programs\stack\x86_64-windows\ghc-8.2.2\bin\ghc-pkg.EXE
 list
 C:\Users\ghc-pkg.EXE: <stdout>: commitBuffer: invalid argument (invalid
 character)
 PS C:\Users\日> chcp 65001
 Active code page: 65001
 PS C:\Users\日> C:\Users\日
 \AppData\Local\Programs\stack\x86_64-windows\ghc-8.2.2\bin\ghc-pkg.EXE
 list
 C:\Users\日
 \AppData\Local\Programs\stack\x86_64-windows\ghc-8.2.2\lib\package.conf.d
     Cabal-2.0.1.0
     Win32-2.5.4.1
     array-0.5.2.0
     base-4.10.1.0
     binary-0.8.5.1
     bytestring-0.10.8.2
     containers-0.5.10.2
     deepseq-1.4.3.0
     directory-1.3.0.2
     filepath-1.4.1.2
     (ghc-8.2.2)
     ghc-boot-8.2.2
     ghc-boot-th-8.2.2
     ghc-compact-0.1.0.0
     ghc-prim-0.5.1.1
     ghci-8.2.2
     haskeline-0.7.4.0
     hoopl-3.10.2.2
     hpc-0.6.0.3
     integer-gmp-1.0.1.0
     pretty-1.1.3.3
     process-1.6.1.0
     rts-1.0
     template-haskell-2.12.0.0
     time-1.8.0.2
     transformers-0.5.2.0
     xhtml-3000.2.2

 PS C:\Users\日> chcp 20127
 Active code page: 20127
 PS C:\Users\日> C:\Users\日
 \AppData\Local\Programs\stack\x86_64-windows\ghc-8.2.2\bin\ghc-pkg.EXE
 list
 C:\Users\ghc-pkg.EXE: <stdout>: commitBuffer: invalid argument (invalid
 character)
 }}}

 With Japanese system locale, as copied from PowerShell:

 {{{
 Windows PowerShell
 Copyright (C) 2016 Microsoft Corporation. All rights reserved.

 PS C:\Users\日> chcp
 Active code page: 932
 PS C:\Users\日> C:\Users\日
 \AppData\Local\Programs\stack\x86_64-windows\ghc-8.2.2\bin\ghc-pkg.exe
 list
 C:\Users\日
 \AppData\Local\Programs\stack\x86_64-windows\ghc-8.2.2\lib\package.conf.d
     Cabal-2.0.1.0
     Win32-2.5.4.1
     array-0.5.2.0
     base-4.10.1.0
     binary-0.8.5.1
     bytestring-0.10.8.2
     containers-0.5.10.2
     deepseq-1.4.3.0
     directory-1.3.0.2
     filepath-1.4.1.2
     (ghc-8.2.2)
     ghc-boot-8.2.2
     ghc-boot-th-8.2.2
     ghc-compact-0.1.0.0
     ghc-prim-0.5.1.1
     ghci-8.2.2
     haskeline-0.7.4.0
     hoopl-3.10.2.2
     hpc-0.6.0.3
     integer-gmp-1.0.1.0
     pretty-1.1.3.3
     process-1.6.1.0
     rts-1.0
     template-haskell-2.12.0.0
     time-1.8.0.2
     transformers-0.5.2.0
     xhtml-3000.2.2

 PS C:\Users\日> chcp 65001
 Active code page: 65001
 PS C:\Users\日> C:\Users\日
 \AppData\Local\Programs\stack\x86_64-windows\ghc-8.2.2\bin\ghc-pkg.exe
 list
 C:\Users\日
 \AppData\Local\Programs\stack\x86_64-windows\ghc-8.2.2\lib\package.conf.d
     Cabal-2.0.1.0
     Win32-2.5.4.1
     array-0.5.2.0
     base-4.10.1.0
     binary-0.8.5.1
     bytestring-0.10.8.2
     containers-0.5.10.2
     deepseq-1.4.3.0
     directory-1.3.0.2
     filepath-1.4.1.2
     (ghc-8.2.2)
     ghc-boot-8.2.2
     ghc-boot-th-8.2.2
     ghc-compact-0.1.0.0
     ghc-prim-0.5.1.1
     ghci-8.2.2
     haskeline-0.7.4.0
     hoopl-3.10.2.2
     hpc-0.6.0.3
     integer-gmp-1.0.1.0
     pretty-1.1.3.3
     process-1.6.1.0
     rts-1.0
     template-haskell-2.12.0.0
     time-1.8.0.2
     transformers-0.5.2.0
     xhtml-3000.2.2

 PS C:\Users\日> chcp 20127
 Active code page: 20127
 PS C:\Users\日> C:\Users\日
 \AppData\Local\Programs\stack\x86_64-windows\ghc-8.2.2\bin\ghc-pkg.exe
 list
 C:\Users\ghc-pkg.exe: <stdout>: commitBuffer: invalid argument (invalid
 character)
 }}}

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15021>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

