Proposal #3337: expose Unicode and newline translation from System.IO

Duncan Coutts duncan.coutts at worc.ox.ac.uk
Thu Jul 2 18:35:20 EDT 2009


On Tue, 2009-06-30 at 13:03 +0100, Simon Marlow wrote:
> Ticket:
> 
>    http://hackage.haskell.org/trac/ghc/ticket/3337
> 
> For the proposed new additions, see:
> 
>   * http://www.haskell.org/~simonmar/base/System-IO.html#23
>     System.IO (Unicode encoding/decoding)
> 
>   * http://www.haskell.org/~simonmar/base/System-IO.html#25
>     System.IO (Newline conversion)
> 
> Discussion period: 2 weeks (14 July).

A couple of things we brought up at the GHC IRC meeting yesterday:

* UTF-8 with or without a BOM? Alongside utf8 we could add a utf8_bom
variant. Do we need all three behaviours:
   (pass through BOM, produce no BOM)       -- raw utf8
   (accept and ignore BOM, produce BOM)     -- utf8 with BOM
   (accept and ignore BOM, produce no BOM)  -- permissive

After thinking about it a bit, I think we can get away with just the
existing utf8 and a utf8_bom that accepts a BOM and produces a BOM. The
reason is that to get the third behaviour you just read with utf8_bom
and write with utf8. Most operations on text files read or write the
whole file, rather than opening a single file in read/write mode.
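
To make the "read with utf8_bom, write with utf8" idea concrete, here is
a rough sketch. It assumes utf8_bom ends up exported from System.IO next
to utf8 and hSetEncoding, as in the proposed haddocks. The helper names
readPermissive and writeNoBom and the file names are made up for
illustration only.

```haskell
import System.IO

-- Hypothetical helper: read a whole file, accepting and dropping a
-- leading BOM if one is present (the decoding half of utf8_bom).
readPermissive :: FilePath -> IO String
readPermissive path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8_bom
    s <- hGetContents h
    -- hGetContents is lazy: force the whole string before withFile
    -- closes the handle.
    length s `seq` return s

-- Hypothetical helper: write a whole file as plain UTF-8, with no BOM
-- (the encoding half of utf8).
writeNoBom :: FilePath -> String -> IO ()
writeNoBom path s =
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h s

main :: IO ()
main = do
  -- Create an input file that starts with a BOM.
  withFile "bom-demo-in.txt" WriteMode $ \h -> do
    hSetEncoding h utf8_bom
    hPutStr h "hello\n"
  -- Permissive behaviour: accept and ignore the BOM, produce no BOM.
  s <- readPermissive "bom-demo-in.txt"
  writeNoBom "bom-demo-out.txt" s
  -- Read the result back with raw utf8, which would pass a BOM through
  -- as the character U+FEFF if one had been written.
  raw <- withFile "bom-demo-out.txt" ReadMode $ \h -> do
    hSetEncoding h utf8
    t <- hGetContents h
    length t `seq` return t
  print (take 1 raw /= "\xFEFF")
```

So the permissive variant falls out of picking a different encoding per
handle, and does not need a third TextEncoding value of its own.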

* For the moment we are not publicly exposing the TextEncoding type.
Later we may want to consider making TextEncoding pure (using ST) and
sharing it for pure conversions String/Text <-> ByteString.

Duncan


