[Template-haskell] FW: Partial evaluation using TH

Wed Dec 31 17:31:54 EST 2003

I'm forwarding this rather interesting message to the TH mailing list

Simon

-----Original Message-----
From: Duncan Coutts [mailto:duncan.coutts at worcester.oxford.ac.uk] 
Sent: 19 December 2003 22:53
To: Ian Lynagh; Simon Peyton-Jones
Subject: Partial evaluation using TH

Hi Ian, Simon,

I've been playing with partial evaluation and am looking at doing
partial evaluation for Haskell using Template Haskell. I've come across
an issue with which I was hoping you might help me to understand what is
going on.

I've got a working (trivial) self-applicable partial evaluator using TH.
It is a trivial specialiser in the sense that it doesn't do any
specialisation, but I'm just looking at the feasibility at the moment. 

A bit of background; people building partial evaluators like to do this
trick where they specialise the specialiser with respect to itself. If
the specialiser function is called mix, they do:
cogen = mix mix mix
This works fine if your in an untyped language. In Haskell you need to
use encodings (of encodings) of programs to get a sensible type.

Before launching into the TH version, I've got a typed model which is
quite useful for seeing what is going on.

We model TH's AST data type as a data type with a phantom type parameter
which is the type of the value that the AST represents

> data Exp t = ...

Now we model the |[ ]| and $() operators as functions:
> spliceIn :: Exp t -> t
> spliceOut :: t -> Exp t

Obviously, we can't define these functions. We also have to remember to
use them with the same syntactic restrictions as their TH counterparts.

Now mix; it takes an expression representing a function and applies it
to its static argument.

> mix :: Exp (a -> b) -> Exp a -> Exp b

We will also need encodings of mix
> mix' = spliceOut mix
> mix'' = spliceOut mix'

So the self application works like so
> cogen :: Exp (Exp (a -> b)) -> Exp (Exp a -> Exp b)
> cogen = spliceIn(mix mix' mix'')

The double encoding on the LHS concerns me somewhat, I'll come back to
that.

We can now use this function to build a compiler for an embeded language
given an interpreter for that language. Suppose I've got some trivial
imperative language and an interpreter for it
(evalProg takes a Prog AST an initial value assignment of variables and
returns the result of running the program)
> evalProg :: Prog -> State -> Value

Then a compiler function for this language can be defined as
> comp :: Exp Prog -> Exp (State -> Value)
> comp = spliceIn (cogen (spliceOut (spliceOut evalProg)))

Note the use of the double encoding.

Finally, the compiler function can be used to compile a program in this
trivial language into a Haskell function:

> prog :: Prog
> prog = [Return (Var "a")]

> target :: State -> Value
> target = spliceIn (comp (spliceOut prog))

We can translate this in a straightforward way into TH. For this brief
presentation I'll pretend we don't have any staging restrictions (in
reality it is spread over 6 modules).

> mix :: ExpQ -> ExpQ -> ExpQ
> mix f x = f `app` x			--trivial specialiser

> mix', mix'' :: ExpQ
> mix'  = [| mix |]
> mix'' = [| mix' |]

> cogen :: ExpQ -> ExpQ
> cogen = $(mix mix' mix'')

> evalProg' = [| evalProg |]		--same evalProg as before

> compileProg :: ExpQ -> ExpQ
> compileProg = $( cogen [| evalProg' |] )

Again, note the double encoding of evalProg.

> prog :: Prog
> prog = [Return (Var "a")]

> target :: State -> Value
> target = $(compileProg [| prog |])

And it works:
*THTest> target [("a", 5)]
5

So, the thing that concerns me is the double encoding. I can't
intuitively see the need for it and it is certianly a pain to have to do
whenever one uses cogen. On the other hand, there's no real performance
overhead due to the double encoding (as there is usually when
self-applying partial evaluators in typed languages) because if we look
at evalProg' and evalProg'' we see:
   evalProg'    = return $ Var "Interp:evalProg"
[| evalProg' |] = return $ Var "Interp1:evalProg'"
(in contrast, in normal typed partial evaluators you can end up with a
massive amount of extra storage taken up by the double encoding)

I'm wondering if there is any way to write a function
Exp t -> Exp (Exp t)
So that the result must be spliced in twice to get a value of type t
(Again, I'm using a phantom type to indicate what is going on.)

That way one could use cogen:
compileProg = $( cogen [| evalProg |] )
rather than
compileProg = $( cogen [| [| evalProg |] |] )
(I know you can't nest [||], but that's sort of what we're doing)

Any ideas? Why does the double encoding occur? Does it make any sense in
TH? Can I elimiate/hide it?

Duncan

-- 
Duncan Coutts, BA
Probationary Research Student in Computer Science
University of Oxford