[Haskell-cafe] UTF-8 BOM
Tony Morris
tonymorris at gmail.com
Wed Jan 5 02:08:22 CET 2011
I am reading files with System.IO.readFile. Some of these files start
with a UTF-8 byte order mark (BOM: 0xef 0xbb 0xbf). Some of the functions
that process the resulting String choke on it, so I drop the BOM as shown
below. This feels particularly hacky, but I am not in control of many of
these functions (which could perhaps use ByteString with a better solution).
I'm wondering if there is a better way of achieving this goal. Thanks
for any tips.
dropBOM ::
  String
  -> String
dropBOM [] =
  []
dropBOM s@(x:xs) =
  let unicodeMarker = '\65279' -- U+FEFF, the character the UTF-8 BOM decodes to
  in if x == unicodeMarker then xs else s

readBOMFile ::
  FilePath
  -> IO String
readBOMFile p =
  dropBOM `fmap` readFile p
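
For comparison, here is a minimal sketch of two possible alternatives; it is not a tested replacement. It assumes the files are always UTF-8 (readFile instead uses the locale encoding) and that the installed base exports utf8_bom from System.IO, which I believe is the case from GHC 6.12 onwards. The names stripBOM and readFileSkippingBOM are hypothetical and chosen only for illustration.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)
import System.IO

-- Pure variant: remove one leading U+FEFF if present, otherwise
-- return the input unchanged. Same behaviour as dropBOM above.
stripBOM :: String -> String
stripBOM s = fromMaybe s (stripPrefix "\xFEFF" s)

-- IO variant: open the file and select the utf8_bom encoding, so the
-- decoder skips a leading BOM before any String is produced.
-- Assumption: the file really is UTF-8, whatever the locale says.
readFileSkippingBOM :: FilePath -> IO String
readFileSkippingBOM p = do
  h <- openFile p ReadMode
  hSetEncoding h utf8_bom
  hGetContents h -- lazy; the handle is closed once the contents are consumed
```

The IO variant moves the BOM handling into the decoder, so downstream functions never see the marker; the pure variant is only a shorter spelling of the same check for cases where the String has already been read.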
--
Tony Morris
http://tmorris.net/