[GHC] #13486: inconsistency in handling the BOM Byte-order-mark in reading and putStrLn
GHC
ghc-devs at haskell.org
Sun Mar 26 10:04:06 UTC 2017
#13486: inconsistency in handling the BOM Byte-order-mark in reading and putStrLn
-------------------------------------+-------------------------------------
Reporter: andrewufrank | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.0.2
Resolution: | Keywords:
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Poor/confusing error message | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Description changed by andrewufrank:
New description:
This is a very annoying issue and has been discussed before (e.g. #1744
and https://mail.haskell.org/pipermail/haskell-cafe/2011-January/088021.html).

I think it is OK that the BOM character is not automatically removed when
reading a file, but it is then INCONSISTENT not to show the BOM character
when printing the file content.

A minimal test:
{{{
main :: IO ()
main = do
    v <- readFile "fileWithBOM"
    putStrLn "the file content"
    putStrLn v         -- prints the content; the BOM is not visible
    putStrLn (show v)  -- show escapes the BOM as \65279
}}}
The output of putStrLn v gives no indication that a BOM character was in
the input and is still present in the result. Only putStrLn (show v), which
applies an otherwise unneeded show to the result string, reveals the
presence of the BOM character:

"\65279\r\n.sprache English\r\n\.....

Consistency here is important to warn the programmer early on (right after
reading and checking the file content), because other tools (e.g. parsec)
see the BOM character and fail.
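
Until the behaviour changes, a program can make the BOM explicit itself. The following is only a minimal sketch: the names bom, hasBOM, stripBOM and markBOM are hypothetical helpers, not functions from base or from this ticket, and the "<BOM>" tag is an arbitrary choice for illustration.

```haskell
-- Hypothetical helpers; none of these names come from base or the ticket.

-- | U+FEFF, the code point that show renders as \65279.
bom :: Char
bom = '\xFEFF'

-- | True when the string starts with a BOM.
hasBOM :: String -> Bool
hasBOM (c:_) = c == bom
hasBOM []    = False

-- | Drop one leading BOM, if present; any later U+FEFF is left alone.
stripBOM :: String -> String
stripBOM (c:rest) | c == bom = rest
stripBOM s = s

-- | Replace a leading BOM with a visible tag so that printing reveals it.
markBOM :: String -> String
markBOM s
  | hasBOM s  = "<BOM>" ++ drop 1 s
  | otherwise = s
```

With these, putStrLn (markBOM v) would print the tag ahead of the file content, and stripBOM v could be applied before handing the string to a parser.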
I recommend that the BOM character is read and also shown by putStrLn. I
guess this is preferable to automatic (silent) removal. Reading it in and
not showing it, however, leads to misguided searches for strange errors
caused by the BOM.
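
For comparison, the silent-removal alternative can be sketched with the utf8_bom encoding from System.IO, which the base documentation describes as ignoring a BOM at the beginning of the input stream. The function name readFileSkippingBOM is hypothetical, and the sketch assumes the file is UTF-8 encoded.

```haskell
import System.IO

-- Hypothetical helper, not part of base: read a UTF-8 file and let the
-- utf8_bom encoding discard a leading BOM, if there is one.
readFileSkippingBOM :: FilePath -> IO String
readFileSkippingBOM path = do
  h <- openFile path ReadMode
  hSetEncoding h utf8_bom   -- on input, a BOM at the start is ignored
  hGetContents h            -- lazy; the handle is closed once fully read
```

This hides the BOM from the program entirely, which is exactly the silent behaviour argued against above, so it fits only when the caller does not need to know whether a BOM was present.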
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13486#comment:1>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler