[GHC] #13486: inconsistency in handling the BOM Byte-order-mark in reading and putStrLn
GHC
ghc-devs at haskell.org
Sun Mar 26 10:02:55 UTC 2017
#13486: inconsistency in handling the BOM Byte-order-mark in reading and putStrLn
-------------------------------------+-------------------------------------
Reporter: andrewufrank | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.0.2
Keywords: | Operating System: Linux
Architecture: Unknown/Multiple | Type of failure: Poor/confusing error message
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
This is a very annoying issue and has already been discussed (e.g. #1744 and
https://mail.haskell.org/pipermail/haskell-cafe/2011-January/088021.html).
I think it is OK that the BOM character is not automatically removed when
reading a file, but it is then INCONSISTENT not to show the BOM character
when printing the file content.
A minimal test:
    main :: IO ()
    main = do
        v <- readFile "fileWithBOM"
        putStrLn "the file content"
        putStrLn v
        putStrLn (show v)
        return ()
The output of putStrLn v gives no indication that the input contains a BOM
character that was not removed from the result. Only the last call,
putStrLn (show v), which prints the escaped form of the result string,
demonstrates the presence of the BOM character:
"\65279\r\n.sprache English\r\n\.....
Consistency here is important to warn the programmer early on (when
reading and checking the file content), because other tools (e.g. parsec) see
the BOM character and fail.
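As a workaround until the behaviour is made consistent, the BOM can be dropped explicitly before the string reaches a parser. The following is only a sketch: stripBOM and readUtf8DroppingBOM are hypothetical helper names, not part of base, and the second helper assumes the file is UTF-8. It relies on utf8_bom from System.IO, which ignores a BOM at the start of the stream on input.

```haskell
import System.IO (IOMode (ReadMode), hGetContents, hSetEncoding, openFile, utf8_bom)

-- Hypothetical helper: drop a single leading U+FEFF (decimal 65279), if present.
-- Any other string is returned unchanged.
stripBOM :: String -> String
stripBOM ('\xFEFF' : rest) = rest
stripBOM s = s

-- Hypothetical helper: read a file that is assumed to be UTF-8, letting the
-- utf8_bom encoding skip a BOM at the start of the stream.
-- The contents are read lazily, as with readFile.
readUtf8DroppingBOM :: FilePath -> IO String
readUtf8DroppingBOM path = do
  h <- openFile path ReadMode
  hSetEncoding h utf8_bom
  hGetContents h
```

With the test above, one would write v <- fmap stripBOM (readFile "fileWithBOM") and hand v to parsec. This removes the BOM silently, so it addresses the parse failure but not the visibility problem described in this ticket.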
I recommend that the BOM character is read and also shown by putStrLn; I
guess this is preferable to automatic (silent) removal. Reading it in and
not showing it, however, leads to misguided searches for strange errors
caused by the BOM.
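To illustrate what "shown" could look like from user code, here is a rough sketch of a printing wrapper that makes the BOM visible. The names visibleBOM and putStrLnVisible and the "<BOM>" marker are made up for this example; this is not a proposal for how putStrLn itself should render the character.

```haskell
-- Hypothetical helper: replace every U+FEFF with a visible marker.
-- The marker text "<BOM>" is an arbitrary choice for this sketch.
visibleBOM :: String -> String
visibleBOM = concatMap mark
  where
    mark '\xFEFF' = "<BOM>"
    mark c = [c]

-- Hypothetical wrapper around putStrLn that prints the marked-up string.
putStrLnVisible :: String -> IO ()
putStrLnVisible = putStrLn . visibleBOM
```

Using putStrLnVisible v in place of putStrLn v in the test above would print the marker at the start of the output, so the BOM is noticed without resorting to show.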
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13486>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler