Proposal #3337: expose Unicode and newline translation from
duncan.coutts at worc.ox.ac.uk
Thu Jul 2 18:35:20 EDT 2009
On Tue, 2009-06-30 at 13:03 +0100, Simon Marlow wrote:
> For the proposed new additions, see:
> * http://www.haskell.org/~simonmar/base/System-IO.html#23
> System.IO (Unicode encoding/decoding)
> * http://www.haskell.org/~simonmar/base/System-IO.html#25
> System.IO (Newline conversion)
> Discussion period: 2 weeks (14 July).
A couple of things we brought up at the GHC IRC meeting yesterday:
* UTF-8 with or without BOM? Or a utf8_bom variant? Do we need all three
  behaviours:
    (pass through bom, produce no bom) -- raw utf8
    (accept and ignore bom, produce bom) -- utf8 with bom
    (accept and ignore bom, produce no bom) -- permissive
After thinking about it a bit, I think we can get away with just the
existing utf8 and a utf8_bom that accepts a bom and produces a bom. The
reason is that to get the third behaviour you just read with utf8_bom
and write with utf8. Most operations on text files read or write the
whole file, rather than mixing reads and writes on a single open file.
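To make the "read with utf8_bom, write with utf8" idea concrete, here is
a minimal sketch. It assumes utf8_bom ends up exported from System.IO
next to utf8, as proposed; readPermissive and writeNoBom are hypothetical
helper names used only for illustration, not part of the proposal.

```haskell
import System.IO

-- Hypothetical helper: read a whole file, accepting and discarding a
-- leading BOM if one is present (assumes utf8_bom is exported).
readPermissive :: FilePath -> IO String
readPermissive path = withFile path ReadMode $ \h -> do
  hSetEncoding h utf8_bom
  s <- hGetContents h
  -- Force the lazy contents before withFile closes the handle.
  length s `seq` return s

-- Hypothetical helper: write a whole file as plain UTF-8, with no BOM.
writeNoBom :: FilePath -> String -> IO ()
writeNoBom path s = withFile path WriteMode $ \h -> do
  hSetEncoding h utf8
  hPutStr h s
```

Pairing the two helpers gives the permissive behaviour (accept and
ignore bom, produce no bom) without needing a third encoding.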
* For the moment we are not publicly exposing the TextEncoding type.
Later we may want to consider making TextEncoding pure (using ST) and
share it for pure conversions String/Text <-> ByteString.