[Haskell-cafe] Alex Lexer Performance Issues

Richard O'Keefe ok at cs.otago.ac.nz
Mon Jun 27 01:16:13 CEST 2011

On 26/06/2011, at 10:19 PM, Malcolm Wallace wrote:
> There is an old folklore that lexing is usually the most expensive phase of any compiler-like traversal.  50% of time and space expended on lexing was pretty common twenty years ago.

Indeed it is old, but no, it isn't folklore, you'll find actual measurements in Per Brinch Hansen,
and no, what was pretty common twenty years ago is not necessarily valid today.

Depending on what you are compiling into what, you could be spending a lot of time reading,
a lot of time writing, or a lot of time doing semantic analysis and optimisation.  For
example, lexing C++ is clearly a linear time operation, but type checking C++ is Turing
complete (any computable function can be expressed as a C++ type checking problem).
There's a Scheme compiler I know of where semantic analysis is at least O(N**3), N being
the number of nodes in the AST.

Also, much depends on the level at which you are doing I/O.  Stuff layered on top of C stdio
can be amazingly slow (thanks in part to libraries that acquire and release a lock for 
every call to getc() or putc()).

More information about the Haskell-Cafe mailing list