[Haskell-cafe] Writing a compiler in Hakell

Jason Dagit dagit at codersbase.com
Wed May 6 02:50:51 EDT 2009


On Tue, May 5, 2009 at 11:07 PM, Rouan van Dalen <rvdalen at yahoo.co.uk> wrote:
>
> Hi everyone.
>
> I am designing my own programming language.
>
> I would like to know what is the best way to go about writing my compiler in haskell.
> What are the tools available in haskell that can help with compiler construction?

I think GHC itself uses Alex and Happy for tokenizing and parsing
respectively.  Parsec is a nice parser combinator library.  Another
tool you could employ is Higher Order Abstract Syntax (more of an
idiom), and there are papers about it if you google for them.  It's a
way of representing the program you want to compile and doing
transformations on it.

> I know about Happy.  Is that a good tool to use?

My understanding is that GHC uses it.  I think they are happy with
happy, but maybe a GHC dev could comment.

> The compiler is intended for serious use and I would like it to be very efficient, maybe competing
> with compilers written in C.  It should also be very easy to extend as the languoge grows.

I think one of the biggest slow downs for compilers is the complexity
of parsing and the complexity of the compilation.  For example, If you
pick a grammar that parses efficiently and do less sophisticated
analyses involving a single pass and so on it will be fast to compile.
 The draw back is that the compiled programs may execute more slowly
than if you did lots of analysis or optimization.  As an example of
what not to do, C++ has a notoriously complex to parse grammar along
with some other issues that make it a slow compiling language :)

> Are there any good books that you can recommend on compiler construction in general and specific to haskell?

I like "Modern Compiler Design" myself:
http://www.amazon.com/Modern-Compiler-Design-D-Grune/dp/0471976970

I would also recommend reading the haskell report to get an idea of
what a friendly language specification looks like:
http://www.haskell.org/onlinereport/

One point of advice I would give is defining your language as an
embedded language in Haskell first.  You can start with the data type
that holds your intermediate representation or abstract syntax.  Then
you can work on either code generation or evaluation.  As you start to
settle on your data structures and the language primitives you can
begin working on a parser.  In a sense, the parser is the last part
you need to finish.  I've found this approach to be quite
accommodating and to also save me a lot of time.  If you google for
it, there is a fair bit of literature about domain specific and
embedded domain specific languages in Haskell.  It's quite a good
language for prototyping this way.

Once you write some code you'll discover that some things are taking
too long.  In that case, you'll want to learn how to make proper use
of the GHC profiler, assuming you use GHC as your haskell compiler.
It's quite powerful and by using it correctly you can zoom in on the
slow spots and dramatically improve the performance of your compiler.

I found this paper to be a nice way to get started:
http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf

One of the brilliant bits in that paper is that it teaches you to use
an existing compiler (gcc) to generate code and then learn how to
implement things by example!  I highly recommend reading it.  It's
written about a scheme compiler, but it applies equally well to other
languages.

Good luck!
Jason


More information about the Haskell-Cafe mailing list