"dataflow rewriting engine"

Wed Aug 27 05:35:17 EDT 2008

Manuel M T Chakravarty wrote:
> Deborah Goldsmith:
>> Has there been any thought about working with the LLVM project? I 
>> didn't find anything on the wiki along those lines.
> 
> I have only had a rather brief look at LLVM, but my understanding at the 
> moment is that LLVM would not be able to support one of GHC's current 
> code layout optimisations.  More precisely, with LLVM, it would not be 
> possible to enforce that the meta data for a closure is placed right 
> before (in terms of layout in the address space) the code executing the 
> "eval" method of that same closure.  GHC uses that to have the closure 
> code pointer point directly to the "eval" code (and hence also, by an 
> appropriate offset, to the various fields of the meta data).  If that 
> layout cannot be ensured, GHC needs to take one more indirection to 
> execute "evals" (which is a very frequent operation) - this is what an 
> unregisterised build does btw.
> 
> However, I am not convinced that this layout optimisation is really 
> gaining that much extra performance these days.  In particular, since 
> dynamic pointer tagging, very short running "evals" (for which the extra 
> indirection incurs the largest overhead) have become less frequent.  
> Even if there is a slight performance regression, I think it would be 
> worthwhile to consider giving up on the described layout constraint.  It 
> is the Last Quirk that keeps GHC from using standard compiler back-ends 
> (such as LLVM), and I suspect it is not worth it anymore.
> 
> When we discussed this last, Simon Marlow planned to run benchmarks to 
> determine how much performance the layout optimisation gains us these 
> days.  Simon, did you ever get around to that?

I didn't get around to benchmarking it, but since the layout optimisation 
is easily switched off (it's called tablesNextToCode inside GHC) there's 
really nothing stopping someone from building a backend that doesn't rely 
on it.  Everything works without this optimisation, including GHCi, the 
debugger, and the FFI.
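
To make the two layouts concrete, here is a rough C sketch of the
difference.  It is only a model, not GHC's actual RTS definitions: the
names (Meta, InfoTableA, InfoBlockB, enter_a, enter_b, meta_b) and the
two metadata fields are made up for illustration.  Portable C cannot
place data directly in front of a function's machine code, so in layout
B the `code` slot merely stands in for the first instruction of the
entry code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Closure Closure;
typedef int (*EntryFn)(Closure *);

/* Hypothetical metadata; real info tables carry more than this. */
typedef struct {
    int closure_type;
    int ptr_words;
} Meta;

/* Layout A (tables NOT next to code): the info table stores a
 * pointer to the entry code, so entering costs an extra load. */
typedef struct {
    EntryFn entry;
    Meta    meta;
} InfoTableA;

/* Layout B (tables next to code): the metadata sits immediately
 * before the entry code.  `code` is a stand-in for that code. */
typedef struct {
    Meta    meta;
    EntryFn code;
} InfoBlockB;

struct Closure {
    const void *header; /* A: &InfoTableA;  B: &InfoBlockB.code */
    int         payload;
};

/* A toy "eval" body shared by both layouts. */
int eval_payload(Closure *c) { return c->payload + 1; }

/* Layout A: load the header, load the entry pointer, then jump. */
int enter_a(Closure *c) {
    const InfoTableA *info = c->header;
    return info->entry(c);
}

int type_a(const Closure *c) {
    const InfoTableA *info = c->header;
    return info->meta.closure_type;
}

/* Layout B: the metadata is found at a fixed negative offset from
 * the code address held in the header. */
const Meta *meta_b(const Closure *c) {
    assert(c->header != NULL);
    const char *code_addr = c->header;
    return (const Meta *)(code_addr - offsetof(InfoBlockB, code));
}

/* Layout B: natively the header would be jumped to directly.  The
 * dereference below is an artefact of simulating code with a slot. */
int enter_b(Closure *c) {
    const EntryFn *code_addr = c->header;
    return (*code_addr)(c);
}
```

In this model enter_a has to go closure -> info table -> code, whereas
in layout B the header already is the code address and the metadata is
recovered by subtracting a constant.  The extra step in enter_a is the
indirection you pay for when the optimisation is switched off.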

My guess is you'd pay a few percent on average for not doing it.  You're 
quite right that pointer tagging makes it less attractive, but as with 
most optimisations, there are programs that fall outside the common case. 
Programs that do a lot of thunk evals will suffer the most.

Cheers,
	Simon