[Haskell-cafe] Re: Properties of optimizer rule application?

Henning Thielemann lemming at henning-thielemann.de
Wed Jan 16 15:56:53 EST 2008

On Wed, 16 Jan 2008, Roman Leshchinskiy wrote:

> Henning Thielemann wrote:
> > Reading various papers and the Wiki about GHC optimizer rules I got the
> > impression that there are not many properties I can rely on, and I wonder
> > how I can write a reliable fusion framework under this constraint.
> That depends on your definition of reliable. You can't have a framework
> which fuses everything that can be fused but then, I don't think that's
> even theoretically possible.

At least I expect that it fuses greedily and does not stop as long as
rules are applicable. If I introduce intermediate fusable implementations,
I want to be sure that the rules which translate them back are invoked,
so that my optimization attempts cannot make things worse.

> >  I read about the strategy to replace functions early by fusable
> > implementations and replace them back to fast low-level implementation if
> > fusion was not possible. However, can I rely on the back-translation if I
> > have no warranty that the corresponding rule is applied? Is there some
> > warranty that rules are applied as long as applicable rules are available
> > or is the optimizer free to decide that it worked enough for today?
> >  I see several phases with a fixed number of iterations in the output of
> > -ddump-simpl-iterations. Is there some idea behind these phases or is the
> > structure and number rather arbitrary? If there is only a fixed number of
> > simplifier runs, how can I rely on complete fusion of arbitrary large
> > expressions?
> In general, you can't.
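For reference, the replace-early/replace-back strategy from the question can be written down with GHC's phase-annotated rules. The following is only a sketch with made-up names (`Stream`, `toStream`, `fromStream`, `mapS` are illustrative, not taken from any real library); it shows the idiom of rewriting into a fusable form in the early phases and rewriting anything left over back late:

```haskell
module FuseSketch where

-- Hypothetical fusable intermediate representation.
newtype Stream a = Stream [a]

toStream :: [a] -> Stream a
toStream = Stream
{-# NOINLINE toStream #-}

fromStream :: Stream a -> [a]
fromStream (Stream xs) = xs
{-# NOINLINE fromStream #-}

mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Stream xs) = Stream (map f xs)
{-# NOINLINE mapS #-}

{-# RULES
-- Early (phase 2 only): rewrite map into its fusable form.
"map -> mapS"  [~1] forall f xs.
    map f xs = fromStream (mapS f (toStream xs))
-- Fusion proper: adjacent conversions cancel.
"stream/unstream"   forall s.
    toStream (fromStream s) = s
-- Late (phases 1 and 0): anything left unfused goes back to plain map.
"mapS -> map"  [1]  forall f xs.
    fromStream (mapS f (toStream xs)) = map f xs
  #-}
```

Since the simplifier phases count down 2, 1, 0, the `[~1]` rule is active only in phase 2 and the `[1]` rule only in phases 1 and 0, so the two opposite rewrites can never fire in the same phase and loop. The back-translation is still only as reliable as the guarantee that the late phases run at all, which is exactly the concern raised above.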

To give a precise example: if I have a sequence of 'map's
  map f0 . map f1 . ... . map fn
 then is there some length at which this is no longer collapsed to a single
'map'? But then I wonder how it is possible to send the compiler into an
infinite loop with the rule

   "loop"   forall x y.  f x y = f y x

 as stated in the GHC manual.

 I'm still uncertain how much is done in one iteration of one phase, since
there seem to be several rules that can fire in one iteration.
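For concreteness, the collapsing of adjacent 'map's is driven by a single rule of the shape given in the GHC manual. Each firing removes one 'map', so a chain of n 'map's needs n-1 firings, spread over however many simplifier iterations are granted:

```haskell
module MapMap where

{-# RULES
"map/map"  forall f g xs.  map f (map g xs) = map (f . g) xs
  #-}

-- A chain the rule can collapse step by step,
-- once the composition operators have been inlined:
chain :: [Int] -> [Int]
chain = map (+1) . map (*2) . map (subtract 3)
```

Whether a long chain collapses completely therefore depends on the iteration and phase limits, which is the point of the question above.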

> You can control the number of simplifier phases with -fsimplifier-phases
> (in the HEAD only) and the number of iterations in each phase with
> -fmax-simplifier-iterations.

Good to know.

> That said, there are other things that break fusion (such as code
> getting between two functions you want to fuse). Again, you can only try
> to make your framework good enough; it'll never be perfect.

It would be nice to have a flag which randomly alters the compiler's rule
application order, in order to see whether a fusion framework implicitly
relies on a particular behaviour of the current compiler.

> >  Another text passage says that simplification works inside-out on
> > expressions. Such a direction would make the design of rules definitely
> > easier. Having both directions, maybe alternating in the runs of the
> > simplifier, would be also nice. I could then design transforms of the
> > kind:
>  >   toFastStructure . slowA . slowB . slowC . slowWithNoFastCounterpart
>  >   fastA . toFastStructure . slowB . slowC . slowWithNoFastCounterpart
>  >   fastA . fastB . toFastStructure . slowC . slowWithNoFastCounterpart
>  >   fastA . fastB . fastC . toFastStructure . slowWithNoFastCounterpart
>  >   fastA . fastBC . toFastStructure . slowWithNoFastCounterpart
>  >   fastABC . toFastStructure . slowWithNoFastCounterpart
> Again, I don't think you really want to rely on the order of
> simplification. For your example, you only need the following rules:
> toFastStructure (slow{A|B|C} x) = fast{A|B|C} (toFastStructure x)
> fastB (fastC x) = fastBC x
> fastA (fastBC x) = fastABC x
> They do not require any specific traversal order.
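Spelled out as actual pragmas, the traversal-order-independent rules above could look like this. The definitions are dummies invented for the sketch (in a real framework `slowA` etc. would be list functions and `Fast` the fusable representation); they are chosen so that the rules stay semantically valid:

```haskell
module FastRules where

newtype Fast = Fast [Int] deriving Eq

toFastStructure :: [Int] -> Fast
toFastStructure = Fast
{-# NOINLINE toFastStructure #-}

-- Dummy slow functions, distinct so fusion correctness is observable.
slowA, slowB, slowC :: [Int] -> [Int]
slowA = map (+1)
slowB = map (*2)
slowC = map (subtract 1)
{-# NOINLINE slowA #-}
{-# NOINLINE slowB #-}
{-# NOINLINE slowC #-}

-- Fast counterparts on the fused representation.
onFast :: ([Int] -> [Int]) -> Fast -> Fast
onFast f (Fast xs) = Fast (f xs)

fastA, fastB, fastC, fastBC, fastABC :: Fast -> Fast
fastA   = onFast slowA
fastB   = onFast slowB
fastC   = onFast slowC
fastBC  = onFast (slowB . slowC)
fastABC = onFast (slowA . slowB . slowC)

{-# RULES
"toFast/slowA" forall x. toFastStructure (slowA x) = fastA (toFastStructure x)
"toFast/slowB" forall x. toFastStructure (slowB x) = fastB (toFastStructure x)
"toFast/slowC" forall x. toFastStructure (slowC x) = fastC (toFastStructure x)
"fastB/fastC"  forall x. fastB (fastC x) = fastBC x
"fastA/fastBC" forall x. fastA (fastBC x) = fastABC x
  #-}
```

Each rule pushes `toFastStructure` one step to the right or merges two already-fast functions, so any order of application reaches the same normal form.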

Ok, this was a bad example. Try this one:
   project . project . foo
 with the rules
   project (project x) = project x
   project (foo     x) = projectFoo x

Both rules can be applied to the expression, but you get one more fusion
if you apply the first rule first. Let me guess: in order to solve that, I
should restrict the first rule to an earlier phase than the second rule.
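That guess corresponds directly to phase annotations on the rules. A sketch with dummy definitions (`project = map abs` is just an idempotent stand-in, invented here so the rules remain semantically valid):

```haskell
module ProjectRules where

-- Dummy, idempotent 'project' and a 'foo' it can absorb.
project, foo, projectFoo :: [Int] -> [Int]
project    = map abs            -- idempotent: abs . abs == abs
foo        = map negate
projectFoo = map (abs . negate)
{-# NOINLINE project #-}
{-# NOINLINE foo #-}
{-# NOINLINE projectFoo #-}

{-# RULES
-- Active in all phases: collapse repeated projections.
"project/project"     forall x. project (project x) = project x
-- Delayed to phases 1 and 0: by then, chains of projections
-- have already been collapsed by the rule above.
"project/foo"  [1]    forall x. project (foo x) = projectFoo x
  #-}
```

With the `[1]` annotation, `project . project . foo` is first reduced to `project . foo` in phase 2, and only then rewritten to `projectFoo`, giving the extra fusion.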

Thanks for the detailed answer, and thanks to the busy people who have
created the optimizer and who have written all the papers and Wiki pages
that make this feature usable. I don't know of another language where it is
possible to control the optimizer in this way.
