Unicode support for Haddock

Max Bolingbroke batterseapower at hotmail.com
Sun Feb 3 20:07:54 CET 2013

Hi GHCers,

I recently ran into a problem where Haddock does not correctly handle
Unicode in doc comments. For example, given this file:

module Example where

-- | 好
ok :: Int -> Int
ok x = x

-- | 个
misinterp :: Int -> Int
misinterp _ = (-1)

-- | 漢
failure :: Int -> Int
failure x = x-1

Current versions of Haddock output the documentation for "ok"
correctly, output an empty bulleted list as the documentation for
"misinterp", and output no documentation at all for "failure"
(echoing a warning to stderr instead).
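
One plausible reading of this pattern, offered as an assumption rather
than something the ticket confirms, is that a byte-oriented lexer keeps
only the low 8 bits of each code point. The sketch below (lowByte and
samples are hypothetical names for illustration, not Haddock code) shows
what the three characters would collapse to under that assumption:

```haskell
import Data.Bits ((.&.))
import Data.Char (chr, ord)
import Numeric (showHex)

-- ASSUMPTION: models a lexer that truncates each Char to its low byte.
-- This is an illustration only, not code taken from Haddock or Alex.
lowByte :: Char -> Char
lowByte c = chr (ord c .&. 0xFF)

-- The three characters from the doc comments above, written as escapes
-- so that the program's output stays pure ASCII.
samples :: [(String, Char)]
samples =
  [ ("ok",        '\x597D')  -- 好
  , ("misinterp", '\x4E2A')  -- 个
  , ("failure",   '\x6F22')  -- 漢
  ]

-- Render one sample as "name: U+<hex code point> -> <truncated char>".
describe :: (String, Char) -> String
describe (name, c) =
  name ++ ": U+" ++ showHex (ord c) "" ++ " -> " ++ show (lowByte c)

main :: IO ()
main = mapM_ (putStrLn . describe) samples
```

Under that assumption the three characters become '}', '*' and '"'.
In Haddock markup '*' introduces a bulleted list item and '"' opens a
module reference, which would be consistent with the empty list for
"misinterp" and the parse failure for "failure".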

This is kind of sad. There is a very old open ticket about this issue:
http://trac.haskell.org/haddock/ticket/20. The patches I've attached
to that ticket fix the problem by using the native Unicode support in
Alex 3. I've also attached to the ticket a patch which makes the
necessary changes to GHC's build system required to build this new
Haddock correctly.
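
For context, Alex 3 handles Unicode by matching on the UTF-8 byte
sequence of each character, which the lexer's input function supplies
one byte at a time. A minimal sketch of that encoding step follows;
utf8Encode here is a hypothetical helper written for illustration and
is not taken from the attached patches:

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode a single Char as its UTF-8 byte sequence (1 to 4 bytes).
-- Illustrative only; the patches may do this differently.
utf8Encode :: Char -> [Word8]
utf8Encode c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. lead 6,  cont 0]
  | n < 0x10000 = [0xE0 .|. lead 12, cont 6,  cont 0]
  | otherwise   = [0xF0 .|. lead 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Payload bits that go into the leading byte.
    lead s = fromIntegral (n `shiftR` s)
    -- Continuation byte: 10xxxxxx carrying six payload bits.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)
```

With this encoding no byte of a multi-byte character falls in the
ASCII range, so a character such as 漢 can no longer be mistaken for
markup like '"'.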

Do these patches seem OK? Is it fine to insist on Alex 3? I think it
was released in 2011, so by now we can probably assume that it is
available on all machines that will want to build GHC.

If these patches are accepted, at some point we might want to think
about switching to Alex 3's Unicode support in GHC's own lexer rather
than relying on the current hacks. My patches do not make any change
along those lines.

