H98 Text IO

Duncan Coutts duncan.coutts at worc.ox.ac.uk
Tue Feb 26 08:07:50 EST 2008


On Tue, 2008-02-26 at 12:44 +0000, Ross Paterson wrote:
> On Tue, Feb 26, 2008 at 11:47:49AM +0000, Duncan Coutts wrote:
> > The major problem is with code that assumes GHC's Handles are
> > essentially Word8 and layers its own UTF8 or other decoding over the
> > top. The utf8-string package has this problem for example. Such code
> > should be using openBinaryFile because it is reading/writing binary
> > data, not String text.
> 
> As I was saying on cabal-devel, I think this distinction ought to be in
> the types, i.e. we need, in base, a type distinct from Handle that offers
> a Word8 interface to binary I/O, as a foundation for various experiments
> with encodings (which need not all be in base).

I agree. If we can come to a consensus on the interpretation of the H98
text Handles then the next step is to start a discussion on a standard
binary IO system (and I'd certainly support using a different type of
Handle so we never mix up binary data and [Char]).
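
To make the "different type of Handle" idea concrete, here is a rough
sketch only, not a proposed API. The names BinHandle, openBin, closeBin,
putBytes and getBytes are made up for illustration; the sketch assumes the
bytestring package is available and simply wraps an ordinary Handle
opened with openBinaryFile.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)
import System.IO (Handle, IOMode (..), hClose, openBinaryFile)

-- Hypothetical type: a handle that only ever carries bytes.
-- A real module would not export the constructor, so a BinHandle
-- could never be passed to hGetLine, hPutStr and friends.
newtype BinHandle = BinHandle Handle

-- Open a file for binary I/O: no newline or character translation.
openBin :: FilePath -> IOMode -> IO BinHandle
openBin path mode = fmap BinHandle (openBinaryFile path mode)

-- Close the underlying handle.
closeBin :: BinHandle -> IO ()
closeBin (BinHandle h) = hClose h

-- Write raw bytes.
putBytes :: BinHandle -> [Word8] -> IO ()
putBytes (BinHandle h) = B.hPut h . B.pack

-- Read up to n bytes; fewer are returned only at end of file.
getBytes :: BinHandle -> Int -> IO [Word8]
getBytes (BinHandle h) n = fmap B.unpack (B.hGet h n)
```

With something like this, a decoding layer such as utf8-string would take
a BinHandle and produce [Char], and the type checker would reject any
attempt to use the text functions on the same handle.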

The main point of difference so far seems to be whether we pick a fixed
utf8 encoding or the current locale encoding or some mixture
depending on the kind of IO object.

I think that's where we should focus the discussion initially.

It'd be nice if there was agreement between the different
implementations. It seems we're not far from agreement between at least
hugs, ghc and jhc.

Ross, perhaps you can put the argument for what hugs currently does -
always using the locale for all terminal and text file IO rather than
picking a fixed encoding.

Duncan


