patch applied (cabal): First pass at parsing .cabal files as UTF8

Duncan Coutts duncan.coutts at worc.ox.ac.uk
Tue Feb 26 21:32:53 EST 2008


On Mon, 2008-02-25 at 21:49 +0000, Ross Paterson wrote:
> On Mon, Feb 25, 2008 at 09:07:08PM +0000, Duncan Coutts wrote:
> > It's no use pretending that readFile returns Unicode, it just doesn't
> > (except on Hugs which does it properly). GHC is not going to catch up on
> > this any time soon.
> 
> On the contrary, it's the only way to stay sane.  readFile does return
> Unicode, it just doesn't read UTF.  Putting compensating bugs in the
> libraries is only going to make it harder for GHC to change.

> My suggestion is to just write Chars to these Handles, even though text
> handles in GHC currently only work in an ISO-8859-1 locale.  That's what
> the other libraries in your program will be doing with those handles,
> and they're not wrong: the other way lies madness.

So that's basically what I've done in the most recent patches. I pretend
that readFile/writeFile, putStr etc. work for text in the current locale
encoding. For files that we know are UTF-8 because we declare that to be
the case (like .cabal and .hs), we now use toUTF8/fromUTF8 together with
openBinaryFile.
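
As a rough illustration of that approach (a sketch only, not the code
from the patch): open the file in binary mode so that each Char holds
one raw byte, then decode those bytes as UTF-8. The names
decodeUTF8Sketch and readUTF8FileSketch are hypothetical, and the
decoder makes the simplifying assumption that overlong encodings need
not be rejected.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)
import System.IO (IOMode (ReadMode), hGetContents, openBinaryFile)

-- Decode a String in which every Char holds one byte (0..255) of UTF-8.
-- Malformed sequences become U+FFFD. Overlong forms are NOT rejected
-- (a simplification made for this sketch).
decodeUTF8Sketch :: String -> String
decodeUTF8Sketch [] = []
decodeUTF8Sketch (c:cs)
  | b < 0x80              = c : decodeUTF8Sketch cs       -- plain ASCII
  | b >= 0xC0 && b < 0xE0 = continuation 1 (b .&. 0x1F) cs -- 2-byte form
  | b >= 0xE0 && b < 0xF0 = continuation 2 (b .&. 0x0F) cs -- 3-byte form
  | b >= 0xF0 && b < 0xF8 = continuation 3 (b .&. 0x07) cs -- 4-byte form
  | otherwise             = '\xFFFD' : decodeUTF8Sketch cs -- stray byte
  where
    b = ord c

-- Consume n continuation bytes (10xxxxxx), accumulating the code point.
continuation :: Int -> Int -> String -> String
continuation 0 acc rest
  | acc > 0x10FFFF = '\xFFFD' : decodeUTF8Sketch rest -- outside Unicode
  | otherwise      = chr acc : decodeUTF8Sketch rest
continuation n acc (x:xs)
  | ord x .&. 0xC0 == 0x80 =
      continuation (n - 1) ((acc `shiftL` 6) .|. (ord x .&. 0x3F)) xs
-- Truncated or invalid sequence: emit U+FFFD and resume at this byte.
continuation _ _ rest = '\xFFFD' : decodeUTF8Sketch rest

-- Read a file we have declared to be UTF-8, bypassing the text-mode
-- handle entirely by opening it in binary mode.
readUTF8FileSketch :: FilePath -> IO String
readUTF8FileSketch path = do
  h <- openBinaryFile path ReadMode
  bytes <- hGetContents h -- lazy; the handle closes once fully consumed
  return (decodeUTF8Sketch bytes)
```

Everything else keeps using plain readFile/putStr, so it picks up
whatever the text handles do without compensating code.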

Hmm, having said that, we're not yet treating line endings in .hs files
correctly on Windows. Sigh.
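
The issue arises because a handle opened in binary mode does no newline
translation, so CRLF pairs reach us untouched. One possible fix, shown
here only as a sketch with a hypothetical name, is to normalise the
line endings ourselves after reading:

```haskell
-- Turn every CRLF pair into a single LF. A lone '\r' is left alone,
-- which is an assumption of this sketch rather than a settled policy.
normaliseLineEndingsSketch :: String -> String
normaliseLineEndingsSketch ('\r':'\n':rest) =
  '\n' : normaliseLineEndingsSketch rest
normaliseLineEndingsSketch (c:rest) = c : normaliseLineEndingsSketch rest
normaliseLineEndingsSketch []       = []
```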

> Is switching the standard text handles to UTF really an impossibly
> remote prospect?

Seems not :-)

Duncan


