[GHC] #10907: GHC fails to read file with byte-order mark when LANG=C
GHC
ghc-devs at haskell.org
Wed Sep 23 07:41:48 UTC 2015
#10907: GHC fails to read file with byte-order mark when LANG=C
-----------------------------------------+-------------------------------------
Reporter: RyanGlScott                    | Owner:
Type: bug                                | Status: new
Priority: normal                         | Milestone:
Component: Compiler (Parser)             | Version: 7.10.2
Resolution:                              | Keywords:
Operating System: Linux                  | Architecture: x86_64 (amd64)
Type of failure: GHC doesn't work at all | Test Case:
Blocked By:                              | Blocking:
Related Tickets: #6016, #6037            | Differential Revisions:
-----------------------------------------+-------------------------------------
Comment (by nomeata):
The problem seems to be `skipBOM` in `StringBuffer.hs`, which switches to
text mode so that `hLookAhead` is able to consume the whole BOM, instead
of just the first byte. But in text mode the decoding depends on the
locale, so with LANG=C the BOM bytes cannot be decoded.
At first I thought it would make sense to stay in binary mode, but then
`hLookAhead` returns just one byte, which is not enough to detect a BOM.
Using `hGetChar` twice would help, but if there is no BOM, we’d have to
rewind. Are we sure we can `hSeek` on all handles that we need to?
A `Word16` encoding would help. Or maybe it works well enough to force
UTF-8 for this single `hLookAhead`.
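
To make the last idea concrete, here is a minimal sketch of forcing UTF-8 for just the one lookahead and then going back to binary mode. This is not the code in GHC: `skipUtf8BOM` and `lenientUtf8` are hypothetical names, and the use of `mkUTF8 IgnoreCodingFailure` (from base's `GHC.IO.Encoding.UTF8` and `GHC.IO.Encoding.Failure`) is my own assumption, chosen so that a file that does not start with valid UTF-8 does not raise a decoding error during the check.

```haskell
import Control.Exception (IOException, bracket_, try)
import GHC.IO.Encoding.Failure (CodingFailureMode (IgnoreCodingFailure))
import GHC.IO.Encoding.UTF8 (mkUTF8)
import System.IO

-- | Hypothetical helper, not GHC's actual skipBOM.
-- Expects a handle in binary mode, positioned at the start of the file.
-- Consumes a leading BOM (U+FEFF) if there is one and reports whether it did.
skipUtf8BOM :: Handle -> IO Bool
skipUtf8BOM h =
  -- Select UTF-8 explicitly instead of the locale encoding, and always
  -- restore binary mode afterwards, even if an exception escapes.
  bracket_ (hSetEncoding h lenientUtf8) (hSetBinaryMode h True) $ do
    -- hLookAhead fails on an empty file; treat any IO error as "no BOM".
    next <- try (hLookAhead h) :: IO (Either IOException Char)
    case next of
      Right '\xFEFF' -> hGetChar h >> return True
      _              -> return False
  where
    -- Assumption: skip undecodable bytes rather than throw on them.
    lenientUtf8 = mkUTF8 IgnoreCodingFailure
```

The intended use would be to call `skipUtf8BOM` right after opening the source file with `openBinaryFile`, before reading the rest of the contents in binary mode. Because the encoding is set explicitly, the result should not depend on `LANG`, and no `hSeek` is needed when there is no BOM since `hLookAhead` does not consume input.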
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/10907#comment:6>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler