[Haskell-cafe] Tiger compiler: variable escaping analysis phase
dagit at codersbase.com
Thu Jun 24 14:26:01 EDT 2010
2010/6/22 José Romildo Malaquias <j.romildo at gmail.com>
> I have been teaching an introductory course on compiler construction to
> our undergraduates students using Appel's "Modern Compiler
> Implementation in Java". There are also versions of the book in ML and
> C. The books explain how to write a compiler for the Tiger programming
> Now I want to implement a Tiger compiler in Haskell.
> The lexer and parser (built with the help of alex and happy) and the
> type checker are already working.
> Next step is implementing a variable escape analysis phase, needed for
> the intermediate code generator. The objective of this phase is to find
> out if each variable escapes or not. In general an escaping variable
> should be allocated in the stack frame (in memory). Non escaping
> variables may be allocated in the stack frame or in a register.
> Generally, a variable escapes if it is passed by reference, its address
> is taken (using C's & operator, for instance), or it is accessed from a
> nested function. Only the last is possible with Tiger.
> The approach adopted by Appel in his books is easy: a muttable field in
> the abstract syntax of variable declarations, of for expressions, and of
> function formal parameters, which introduce new variables, is used to
> collect the escaping information of the variable. In the ML version of
> the book this is a reference to boolean (bool ref). Initially, in a
> conservative approach, the reference is initialized to true.
> In the variable escaping analysis phase of the compiler, a function
> findEscape looks for escaping variables and record this information in
> the escape fields of the abstract syntax. To do this the entire abstract
> syntax tree is traversed looking for escaping uses of every variable,
> and, when found, the field is set to true.
> findEscape uses a symbol table to accomplish its work, binding variable
> names to a reference to boolean (the same reference used in the abstract
> syntax tree). When processing a variable declaraction, for instance, it
> inserts the variable name into the symbol table, binding it to its
> escaping field. When processing an expression which is a simple
> variable, if the variable occurs in a nested function, its binding in
> the symbol table is set to true. This reflects directly in the abstract
> syntax tree of the variable declaration, as the escape field in the
> variable declaration and the binding in the symbol table are the same
> reference to bool.
> I am look for good ways to implement the variable escaping analysis
> phase in Haskell, which is a pure language. I see two alternatives:
> 1) adopt the same approach as Appel used in his ML version of the
> compiler: use a mutable reference in the IO monad (Data.IORef) to
> hold the variable escaping information, and write everything inside
> the IO monad
> 2) build a new abstract syntax tree with updated information regarding
> variable escaping
> The second option is more elegant in my point of view, but would be much
> less efficient than the first one.
In Haskell you will get a lot of sharing between the original AST and the
updated AST. This greatly reduces the overhead of #2. Have you read Chris
Okasaki's Purely Functional Datastructures book? My other bit of advice
would be to code it in the purely functional way and then do some
benchmarking / profiling to see that it does have sufficiently good
performance characteristics. If you find that it's getting awkward to
manage the purely functional code then refactor that code into a monad that
hides the tedious bits.
I had some other comments to make but others here have already covered it :)
(Such as using Writer or State or ST instead of IO, etc).
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Haskell-Cafe