[Haskell-cafe] Compiler's bane

Fri Aug 29 16:12:48 EDT 2008

On Fri, 2008-08-29 at 20:56 +0100, Andrew Coppin wrote:
> Neil Mitchell wrote:
> > Hi
> >
> >   
> >> I'm writing a simple interpretter for a small extended-lambda-calculus sort
> >> of language. And I'd just like to say... RECURSIVE LET-BINDS! GAAAAH!!! >_<
> >>     
> >
> > Agreed :-)
> >   
> 
> I'm glad it's not just me! ;-)
> 
> (Oleg cat sez: "see, yr type problum not so hard".)
> 
> >> To illustrate:
> >>
> >>  let x = f x; y = 5 in x y
> >>
> >> A simple-minded interpretter might try to replace every occurrance of "x"
> >> with "f x". This yields
> >>
> >>  let y = 5 in (f x) y
> >>     
> >
> > That's wrong, its a two step transformation. You just substitute all
> > occurances of x for f x:
> >
> > let x = f (f x); y = 5 in (f x) y
> >
> > For the case let x = 5 in x, you do the same thing:
> >
> > let x = 5 in 5
> >
> > Now as a second step you hoover up unused let bindings and disguard:
> >
> > 5
> >
> > You seem to be combining the substitution and the removing of dead let
> > bindings, if you separate them you should have more luck.
> >   
> 
> Right. So substitute in the RHS and then perform a let-cull afterwards. 
> Got it.
> 
> I've been playing with this today, and no matter what rules I come up 
> with, it seems to be ridiculously easy to put the interpretter into an 
> infinite loop from which it will never escape. Is there any way to know 
> which binds are "worth" expanding and which ones aren't? (I have a 
> sinking feeling this might be equivilent to the Halting Problem...)
> 
> For example, if you have "let x = f y; y = g x in x" then since all the 
> functions are free variables, there's really nothing much "useful" you 
> can do to simplify this any further. However, my interpretter merrily 
> goes into an endless loop building a huge expression like "f (g (f (g (f 
> (g..." Is it possible to avoid this?

If you want to avoid infinite normal forms (or just plain lack of normal
forms, as in let x = x in x or (\x -> x x) (\ x -> x x)), you need to
follow three rules:

1) Static typing
2) No value recursion
3) If you allow data types, no recursive data types

Otherwise, your language will be Turing-complete, so yes, trying to
determine ahead of time if even a HNF exists will be the halting
problem.

If you really want to write a general-purpose evaluator, then infinite
normal forms may not be an issue, as long as they are generated lazily,
so your evaluator can at least print something out while it goes into an
infinite loop.  Otherwise, you can declare f x, where f is an unknown
function, a HNF, and stop at that point, or go on only under the
client's control.

The optimizers used by GHC and YHC pick one function out of each
recursive binding group, and just refuse to inline it at all.  If you
don't mind producing expressions that aren't really in any definable
HNF, that would be an alternative as well.

jcc