[Haskell-cafe] Re: strange performance of expression evaluators

Sat Jan 13 07:40:30 EST 2007

> I've done some more experiments. The following program defines simple
> arithmetic expression with indexed variables. I've written four
> different ways to evaluate them:
>  - eval1 is simple monadic evaluator
>  - eval2 is the obvious straight-forward implentation
>  - compile1 is attempt to perform compilation
>  - compile2 is refined compile1 with sub-expressions explicitely
>      separated via "let" binding.
> 
> Test evaluates the same expression in 1000000 different environments.
> The results are:
>   - eval1 - 17.47 sec
>   - eval2 - 3.71 sec
>   - compile1 - 3.79 sec
>   - compile2 - 3.74 sec
> 
> This figures are completely mysterious for me.
>    1) I expected eval1 and eval2 to perform equally. In fact, eval1 is
>        4.7 times slower for eval2.
>    2) I expected compile{1,2} to perform much faster then eval{1,2}.
>       However, the "compilation" attempt does not give any speed up at
>       all.

Your intention is that (compile2 test) should analyze the expression
tree of (test) only once when evaluating it for different environments.

I'm not sure whether the constructors (Add), (Mul) etc. get replaced
once and for all by (+) and (*) or whether this really matters, because
(eval2), (compile1) and (compile2) have the same running time. I think
that memoization (as explained in my previous post) only takes place for
values not of function type, i.e. partially evaluated functions aren't
memoized. It may also be that the compiler optimizes things for the
concrete expression (test) you gave in your code. So supplying the
expression interactively could show a difference between (eval2),
(compile1) and (compile2).

Ironically, (eval1) does "compile" as much as you intend (compile2) to
do. But it looks like the overhead imposed by appealing to
Control.Monad.Reader doesn't get inlined away completely.

Currently, you don't do much work per expression, it just gets
evaluated. To take advantage of memoization, you need to do more
"expensive" analysis on a per expression basis. For example, you might
want to precalculate stuff that doesn't depend on the environment:

   data ConstVar a = Const a | Var (Env -> a)

   eval :: ConstVar a -> Env -> a
   eval (Const x) = const x
   eval (Var f)   = f

   -- addition, multiplication etc. do precalculation
   -- when the constituents are known beforehand
   instance Num a => ConstVar a where
       (Const x) + (Const y) = Const (x + y)
       x + y = Var (\e -> eval x e + eval y e)
       ...

   data Expr a = Value a | Variable Name
               | Add (Expr a) (Expr a) | Mul (Expr a) (Expr a)

   compile :: Num a => Expr a -> ConstVar a
   compile (Value c)    = Const c
   compile (Variable v) = Var (\e -> e ! v)
   compile (Add x y)    = (compile x) + (compile y)
   compile (Mul x y)    = (compile x) * (compile y)

   testexpr = (Mul (Value 1) (Value 2)) `Add` (Variable 1)
   test = eval . compile $ testexpr

Of course, this can be improved. For instance, it currently does not
know about the associative law like in

    (Add (Value 1) (Add (Value 2) (Variable 1)))

Now, it is clear that analyzing the expression again and again every
time it needs to be evaluated ("interpretation") is wasted work.

Regards,
apfelmus

PS:

> data Expr = Const !Value | Var !Int
>           | Add !Expr !Expr | Sub !Expr !Expr | Mul !Expr !Expr

You'd better leave out the strictness annotations (!).