[jhc] Grin.

Lemmih lemmih at gmail.com
Sun Dec 7 05:43:58 EST 2008

On Sun, Dec 7, 2008 at 11:26 AM, John Meacham <john at repetae.net> wrote:
> On Sat, Dec 06, 2008 at 08:23:48AM +0100, Lemmih wrote:
>> I've been looking at some grin output and it seems that tags have been
>> dropped in favour of indirect calls. That is, 'eval' now does an
>> indirect call instead of inspecting the node tag. Is there a reason
>> for this?
> There were a couple reasons that functionality was added, Mainly it came
> down to flexibility of implementation. It provides a straightforward
> path for separate compilation. It is not intended that indirect calls be
> used always for evaluation, it just provides a universal fall-back case
> when direct execution is not appropriate for various reasons. Although
> jhc is a whole program optimizer, some support for separate compilation
> would be very convinient, if nothing else it will be needed for dynamic
> code loading at run-time. Another reason had to do with the points-to
> analysis, I found myself "chasing my tail" a lot of the time, in the
> face of features such as arbitrary impredicativity and GADTs, it became
> harder and harder to ensure the points-to analysis would produce an
> accurate and useful result in all possible cases, being able to fall
> back on an indirect evaluation in those cases simplified things
> dramatically. improving the front end, and improving the accuracy of the
> point-to analysis could be decoupled and no longer had to be done in
> lock-step. The dynamic nature of haskell programs also played a part,
> when looking at a histogram of the number of targets a given heap
> location could evaluate to it was very tail-heavy, I found that by far,
> exactly 1 was the most common possibility. Furthermore, after a point,
> the size of the jump table in the cache killed the advantage of the
> direct call over an indirect one. So all in all, it made sense to
> concentrate on a couple cases, the direct case, when you can reduce the
> number of possibilities to just a couple choices, and the fallback case
> where an indirect call is used. Note that where the "sweet spot" lies
> for optimal performance in terms of direct vs indirect execution is
> rather CPU architecture dependent.
> http://www.complang.tuwien.ac.at/forth/threading/
> contains some test code for testing various threading methods in a forth
> implementation, which is somewhat relevant at least in demonstrating how
> different CPUs behave.

What about the missed optimization opportunities? In the GRIN paper,
Boquist shows how 'foldl (+) 0 [1..n]' could be compiled to a strict
loop with no allocations. Without points-to analysis (which seems to
be completely disable at this point) most of the important
optimizations are blocked and the resulting program is far less
efficient than it should be.


More information about the jhc mailing list