[Haskell] non-ASCII characters in Haddock documentation
Simon Marlow
simonmar at microsoft.com
Mon Feb 16 12:52:01 EST 2004
> On Mon, Feb 16, 2004 at 10:20:30AM -0000, Simon Marlow wrote:
> > Wolfgang Jeltsch <wolfgang at jeltsch.net> writes:
> > > I meant non-ASCII characters in source code comments like this:
> > > {-|
> > > The execution time of this function is /n³/.
> > > -}
> > > Currently, Haddock seems to copy the bytes making up the non-ASCII
> > > character verbatim to the HTML file. But since the HTML file doesn't
> > > contain a character set specification, it is ill-formed and it depends
> > > on the browser how this situation is handled.
> >
> > It shouldn't be too hard to fix this, at least for Latin-1 (full
> > Unicode would be somewhat harder). I'll add it to the TODO list.
>
> While Haskell's source charset is specified as Unicode, Haskell source
> files don't specify the byte encoding they use, so any source
> file using non-ASCII characters isn't portable. Entrenching Latin-1
> would make the move to Unicode more difficult.
True, but GHC currently assumes Latin-1 as the encoding for source files. I don't see this as entrenching Latin-1: it just means that, for now, we only accept Latin-1 encoded source files, and at some point in the future we might accept other encodings. Making the same simplifying assumption in Haddock doesn't seem like that big a deal.
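As a rough sketch of what that assumption could look like in practice (this is not Haddock's actual code, and escapeLatin1 is a hypothetical name): if every input character is taken to be a Latin-1 code point, anything outside printable ASCII can be written as an HTML numeric character reference, so the generated page stays pure ASCII and no longer depends on a charset declaration. This works because Latin-1 code points coincide with the first 256 Unicode code points.

```haskell
module EscapeLatin1 (escapeLatin1) where

import Data.Char (ord)

-- Hypothetical helper, not part of Haddock.
-- Assumption: each Char holds a Latin-1 code point (0-255), which is what
-- reading a source file byte-by-byte as Latin-1 would give.
-- HTML-special characters become named entities; any character above
-- code point 126 becomes a decimal numeric character reference; all
-- other characters are passed through unchanged.
escapeLatin1 :: String -> String
escapeLatin1 = concatMap escape
  where
    escape c
      | c == '<'    = "&lt;"
      | c == '>'    = "&gt;"
      | c == '&'    = "&amp;"
      | ord c > 126 = "&#" ++ show (ord c) ++ ";"
      | otherwise   = [c]
```

For the comment quoted above, the superscript three is code point 179, so escapeLatin1 "/n\179/" evaluates to "/n&#179;/".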
Cheers,
Simon