UTF-8 decoding error
Simon Marlow
simonmarhaskell at gmail.com
Fri Feb 3 10:29:20 EST 2006
Christian Maeder wrote:
>> So - do you need Latin-1, or could you use UTF-8?
>
>
> I'm not amused to change the encoding of many Haskell source files
> (particularly those that are not mine).
Fair enough, but there will have to be some way to specify the encoding,
either via a pragma, command-line option, or the locale. I'm really not
sure what is the best choice here. Perhaps all three, with locale being
the default, overridden by pragmas and command-line options.
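
A minimal sketch of the "all three" option, to make the precedence concrete. All names here (Encoding, EncodingSources, resolveEncoding) are hypothetical and are not part of GHC. The message only says that the locale is the default and that pragmas and command-line options override it; letting the command line win over a pragma, and falling back to UTF-8 when nothing is set, are assumptions made for this illustration.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)

-- Hypothetical type: an encoding identified by its name, e.g. "LATIN1".
newtype Encoding = Encoding String
  deriving (Eq, Show)

-- Hypothetical record of the three places an encoding could be specified.
data EncodingSources = EncodingSources
  { fromCommandLine :: Maybe Encoding
  , fromPragma      :: Maybe Encoding
  , fromLocale      :: Maybe Encoding
  }

-- Pick the first source that is set. The order command line > pragma >
-- locale is an assumption, as is the final UTF-8 fallback.
resolveEncoding :: EncodingSources -> Encoding
resolveEncoding src =
  fromMaybe (Encoding "UTF-8") $
    fromCommandLine src <|> fromPragma src <|> fromLocale src
```

With this ordering, a file carrying a Latin-1 pragma would still be read as Latin-1 under a UTF-8 locale, unless a command-line option said otherwise.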
The easiest way for us to handle encodings other than UTF-8 is for it to
be a new preprocessing step, running 'iconv'. (but what do we do on
Windows? bundle iconv? ew.)
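
As a rough sketch of what such a preprocessing step could look like, the following shells out to an 'iconv' executable using the process library. This is not how GHC does it; recodeToUtf8 and iconvArgs are hypothetical names, and the code assumes that iconv is on the PATH and accepts the standard -f and -t options.

```haskell
import System.Exit (ExitCode)
import System.IO (IOMode (WriteMode), withFile)
import System.Process
  ( CreateProcess (..)
  , StdStream (UseHandle)
  , createProcess
  , proc
  , waitForProcess
  )

-- Arguments for recoding a file from the given encoding to UTF-8.
-- Kept pure so it can be inspected without running iconv.
iconvArgs :: String -> FilePath -> [String]
iconvArgs from src = ["-f", from, "-t", "UTF-8", src]

-- Run iconv on src and write its standard output to dst.
-- Returns iconv's exit code; the caller decides how to report failure.
recodeToUtf8 :: String -> FilePath -> FilePath -> IO ExitCode
recodeToUtf8 from src dst =
  withFile dst WriteMode $ \out -> do
    (_, _, _, ph) <-
      createProcess (proc "iconv" (iconvArgs from src)) { std_out = UseHandle out }
    waitForProcess ph
```

For example, recodeToUtf8 "LATIN1" "Foo.hs" "Foo.utf8.hs" would write a UTF-8 copy of a Latin-1 source file. The Windows question remains open in this sketch, since it depends entirely on an external iconv being installed.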
John - what do you plan to do here?
> These files can then no longer be compiled by earlier ghcs (though I
> don't understand how ghc-6.4.1 recognises the lexical error).
>
> I'm tempted to replace "ä" with "\228" in literals. What does haddock do
> with utf-8 in comments? Will DrIFT -- using read- and writeFile -- still
> work correctly?
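
The escape idea in the quoted text can be sketched as a small helper. Since "ä" is code point 228, writing "\228" keeps the source file pure ASCII, so it reads the same whether the file is treated as Latin-1 or UTF-8. The function name escapeNonAscii is hypothetical, and the sketch assumes it is given the text inside a string literal that contains no escapes of its own.

```haskell
import Data.Char (isDigit, ord)

-- Replace every non-ASCII character with a decimal escape such as \228.
escapeNonAscii :: String -> String
escapeNonAscii [] = []
escapeNonAscii (c : rest)
  | ord c < 128 = c : escapeNonAscii rest
  | otherwise   = '\\' : show (ord c) ++ separator ++ escapeNonAscii rest
  where
    -- An escape followed directly by a digit would be read as one longer
    -- number, so the empty escape \& is inserted to end it.
    separator = case rest of
      (d : _) | isDigit d -> "\\&"
      _                   -> ""
```

Applied to the literal text of a word containing "ä" followed by a letter, this yields the same word with \228 in its place; if a digit follows, the result contains \228\& so that the digit is not absorbed into the escape.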
Haddock needs to be updated too. But if GHC implements recoding via
iconv, you can use GHC as a preprocessor to recode back to Latin-1; since
you have to use GHC as a preprocessor with Haddock anyway, this
shouldn't be much harder (of course, if you use non-Latin-1 characters
this fails). Eventually, when Haddock runs on top of GHC, the issue
will go away :)
I don't know about DrIFT.
Cheers,
Simon