[Haskell-cafe] RFC: Unicode support in Alex
Jean-Philippe Bernardy
jeanphilippe.bernardy at gmail.com
Wed Jul 29 08:47:04 EDT 2009
Hello,
I have modified the Alex lexer generator to support Unicode.
The general idea is that the state machine works on the UTF-8
representation of the text. I am submitting my work here for review
in order to take as much load as possible off the maintainer
(Simon Marlow).
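To illustrate what "working on the UTF-8 representation" means, here is
a minimal sketch. It is not code from the patch: the names encodeUtf8,
step and matchesGreekLower are hypothetical, and the automaton is written
by hand rather than generated. It shows how a single character range such
as [α-ω] (U+03B1..U+03C9) turns into transitions over bytes, because
UTF-8 splits that range into CE B1..CE BF and CF 80..CF 89.

```haskell
import Control.Monad (foldM)
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode one code point as its UTF-8 byte sequence.
-- Assumption: surrogate code points are not rejected in this sketch.
encodeUtf8 :: Char -> [Word8]
encodeUtf8 c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Payload bits of the leading byte.
    hi s = fromIntegral (n `shiftR` s)
    -- Continuation byte: 10xxxxxx carrying six payload bits.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Hypothetical hand-written byte-level automaton for [α-ω].
data State = Start | AfterCE | AfterCF
  deriving (Eq, Show)

-- One transition on one byte; Nothing means "no transition" (reject).
step :: State -> Word8 -> Maybe State
step Start   0xCE = Just AfterCE
step Start   0xCF = Just AfterCF
step AfterCE b | b >= 0xB1 && b <= 0xBF = Just Start
step AfterCF b | b >= 0x80 && b <= 0x89 = Just Start
step _       _    = Nothing

-- True iff every character of the input lies in U+03B1..U+03C9.
matchesGreekLower :: String -> Bool
matchesGreekLower s =
  foldM step Start (concatMap encodeUtf8 s) == Just Start
```

The point of the sketch is that the transition function only ever sees
values of type Word8, so the tables stay indexed by at most 256 symbols
no matter how large the character set is. The price is extra
intermediate states for multi-byte characters.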
The prototype is available on github:
git://github.com/jyp/Alex.git
Be sure to:
* check out the "utf8" branch (so "git diff master" shows the changes);
* do a two-stage bootstrap before testing.
Caveats:
* The generated code depends on some utf8 packages;
* There is no attempt to fix the bytestring-based wrappers;
* Left-context recognition is no longer table-based;
* Debug code is still present.
Bug reports, comments, and especially patches are welcome :)
Thanks,
-- JP