patch applied (cabal): First pass at parsing .cabal files as UTF8

Ross Paterson ross at soi.city.ac.uk
Mon Feb 25 16:49:09 EST 2008


On Mon, Feb 25, 2008 at 09:07:08PM +0000, Duncan Coutts wrote:
> It's no use pretending that readFile returns Unicode, it just doesn't
> (except on Hugs which does it properly). GHC is not going to catch up on
> this any time soon.

On the contrary, it's the only way to stay sane.  readFile does return
Unicode; it just doesn't decode UTF-8.  Putting compensating bugs in the
libraries is only going to make it harder for GHC to change.

> If we open the files in binary mode we don't get the cr/lf line
> conversion on Windows and we'd have to do that ourselves. Perhaps that's
> the way to go.

I think we've been ignoring CRs in .cabal files ever since we had to
deal with tar files built on Windows and unpacked on Unix.
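
The two points above (reading without newline translation, and ignoring
CRs) can be sketched roughly as follows.  This is only an illustration,
not Cabal's code: readUTF8NoCR is a hypothetical name, and it assumes a
later GHC whose System.IO provides hSetEncoding and hSetNewlineMode,
which the GHC under discussion here does not.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper (not part of Cabal): read a file as UTF-8
-- regardless of the current locale, and drop every CR so that files
-- with Windows line endings parse the same way on every platform.
readUTF8NoCR :: FilePath -> IO String
readUTF8NoCR path = withFile path ReadMode $ \h -> do
  hSetEncoding h utf8                     -- decode bytes as UTF-8
  hSetNewlineMode h noNewlineTranslation  -- no CR/LF conversion by the handle
  s <- hGetContents h
  _ <- evaluate (length s)                -- force the lazy read before the handle closes
  return (filter (/= '\r') s)             -- ignore CRs ourselves
```

Dropping every CR, rather than only those before an LF, is the simplest
reading of "ignoring CRs"; a stricter version would strip only a
trailing CR from each line.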

> As for stdout/stderr we're just stuffed. We cannot reopen them in binary
> mode and hugs and ghc have different and incompatible behaviour. We
> either end up double encoding with hugs or not decoding with ghc. There
> is no single method that works with both. We'd have to switch on the
> system in use.

My suggestion is to just write Chars to these Handles, even though text
handles in GHC currently only work in an ISO-8859-1 locale.  That's what
the other libraries in your program will be doing with those handles,
and they're not wrong: the other way lies madness.

Is switching the standard text handles to UTF-8 really an impossibly
remote prospect?


