[GHC] #13486: inconsistency in handling the BOM Byte-order-mark in reading and putStrLn

GHC ghc-devs at haskell.org
Sun Mar 26 10:02:55 UTC 2017


#13486: inconsistency in handling the BOM Byte-order-mark in reading and putStrLn
-------------------------------------+-------------------------------------
           Reporter:  andrewufrank   |             Owner:  (none)
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  Compiler       |           Version:  8.0.2
           Keywords:                 |  Operating System:  Linux
       Architecture:                 |   Type of failure:  Poor/confusing
  Unknown/Multiple                   |  error message
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 This is a very annoying issue and has already been discussed (e.g. #1744
 and https://mail.haskell.org/pipermail/haskell-cafe/2011-January/088021.html).

 I think it is OK that the BOM character is not automatically removed when
 reading a file, but it is then INCONSISTENT not to show the BOM character
 when printing the file content.

 A minimal test:

     main :: IO ()
     main = do
         v <- readFile "fileWithBOM"
         putStrLn "the file content"
         putStrLn v         -- no visible sign of the BOM
         putStrLn (show v)  -- show escapes the BOM as \65279

 The plain putStrLn v gives no indication that the input contains a BOM
 character that was not removed from the result; only the last putStrLn
 (with the otherwise incorrect show applied to the result string) reveals
 the presence of the BOM character:

 "\65279\r\n.sprache English\r\n\.....

 Consistency here is important to warn the programmer early on (after
 reading and checking the file content), because other tools (e.g. parsec)
 see the BOM character and fail.
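
 A caller-side workaround can be sketched as follows; it does not address
 the inconsistency reported here. The escape \65279 in the output above is
 the decimal form of U+FEFF. The names bom, hasBOM, stripBOM and
 readFileSkippingBOM are hypothetical helpers, not part of base; utf8_bom
 and hSetEncoding are taken from System.IO, and the sketch assumes the file
 is UTF-8 encoded.

```haskell
import System.IO

-- U+FEFF; `show` renders it in decimal as "\65279".
bom :: Char
bom = '\xFEFF'

-- True when the string starts with the BOM character.
hasBOM :: String -> Bool
hasBOM (c : _) = c == bom
hasBOM []      = False

-- Drop a single leading BOM, if present, e.g. before handing the
-- text to a parser.
stripBOM :: String -> String
stripBOM s
  | hasBOM s  = drop 1 s
  | otherwise = s

-- Hypothetical helper: read a file with the utf8_bom encoding, which
-- ignores a BOM at the start of the input stream.
-- Assumption: the file is UTF-8 encoded.
readFileSkippingBOM :: FilePath -> IO String
readFileSkippingBOM path = do
  h <- openFile path ReadMode
  hSetEncoding h utf8_bom
  hGetContents h
```

 With this, stripBOM could be applied to the result of readFile before the
 text reaches the parser, or readFileSkippingBOM could be used in place of
 readFile.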

 I recommend that the BOM character is read and also shown by putStrLn; I
 guess this is preferable to automatic (silent) removal. Reading it in and
 not showing it, however, leads to misguided searches for strange errors
 caused by the BOM.
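
 To illustrate what a visible BOM could look like when printing, here is a
 minimal user-level sketch. It is not a proposed change to putStrLn itself;
 the names showBOM and putStrLnVisible and the "<BOM>" marker are made up
 for this example.

```haskell
-- Replace every U+FEFF with a visible marker so that it shows up
-- in terminal output. The marker text is an arbitrary choice.
showBOM :: String -> String
showBOM = concatMap visible
  where
    visible '\xFEFF' = "<BOM>"
    visible c        = [c]

-- Hypothetical drop-in for putStrLn while debugging file content.
putStrLnVisible :: String -> IO ()
putStrLnVisible = putStrLn . showBOM
```

 Calling putStrLnVisible v in the test above would then print the marker
 in front of the file content instead of an invisible character.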

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13486>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler