[Haskell] non-ASCII characters in Haddock documentation

Simon Marlow simonmar at microsoft.com
Mon Feb 16 12:52:01 EST 2004

> On Mon, Feb 16, 2004 at 10:20:30AM -0000, Simon Marlow wrote:
> > Wolfgang Jeltsch <wolfgang at jeltsch.net> writes:
> > > I meant non-ASCII characters in source code comments like this:
> > >     {-|
> > >         The execution time of this function is /n³/.
> > >     -}
> > > Currently, Haddock seems to copy the bytes making up the non-ASCII
> > > character verbatim to the HTML file.  But since the HTML file doesn't
> > > contain a character set specification, it is ill-formed and it depends
> > > on the browser how this situation is handled.
> > 
> > It shouldn't be too hard to fix this, at least for Latin-1 (full
> > Unicode would be somewhat harder).  I'll add it to the TODO list.
>
> While Haskell's source charset is specified as Unicode, Haskell source
> files don't specify the byte encoding they use, so any source 
> file using non-ASCII characters isn't portable.  Entrenching Latin-1 
> would make the move to Unicode more difficult.

True, but GHC currently assumes Latin-1 as the encoding for source files.
I don't see that as entrenching Latin-1; it just means we only accept
Latin-1 encoded source files for now, and at some point in the future we
might accept other encodings.  Making the same simplifying assumption in
Haddock doesn't seem like that big a deal.
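
As a rough sketch of what the Latin-1 fix could look like (this is not Haddock's actual code, and `escapeChar` and `escapeHtml` are hypothetical names): if the comment text is read as Latin-1, each byte corresponds to the Unicode code point of the same value, so anything outside ASCII can be written as a numeric character reference.  The generated HTML is then pure ASCII and no longer depends on a character set declaration.

```haskell
import Data.Char (ord)

-- Hypothetical helper, not part of Haddock: escape a single character
-- for inclusion in HTML text.  Assumes the input was decoded as Latin-1,
-- so `ord c` is already the correct Unicode code point.
escapeChar :: Char -> String
escapeChar '<' = "&lt;"
escapeChar '>' = "&gt;"
escapeChar '&' = "&amp;"
escapeChar '"' = "&quot;"
escapeChar c
  | ord c < 128 = [c]                          -- plain ASCII passes through
  | otherwise   = "&#" ++ show (ord c) ++ ";"  -- numeric character reference

-- Escape a whole documentation string.
escapeHtml :: String -> String
escapeHtml = concatMap escapeChar

main :: IO ()
main = putStrLn (escapeHtml "The execution time of this function is n\179.")
-- prints: The execution time of this function is n&#179;.
```

Here `'\179'` is the superscript three (U+00B3) from Wolfgang's example.  The same function would work unchanged for full Unicode input; the hard part there is decoding the source file, not the escaping.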
