CMM-to-ASM: Register allocation weirdness

Sun Jun 19 15:59:48 UTC 2016

Agreed. There are also some other mismatches between GHC and LLVM, in a few
fun / interesting ways!

There's a lot of room for improvement in both code gens, but there's also a
lot of room to improve the ease of experimenting with improvements.  E.g. we
don't have a peephole pass per target, so, last time I checked, peephole
optimizations get hacked into the pretty-printing code.
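
To make the peephole point concrete, here is a toy sketch of what a
standalone pass could look like. It is written in Python purely for
illustration and is not GHC code: the tuple-based instruction format, the
two rewrite rules, and the name `peephole` are all assumptions made up for
this example, and the rules assume plain register moves with no side
effects.

```python
# Toy peephole pass over a made-up instruction list (NOT GHC's real
# instruction type). Each instruction is a tuple: (opcode, dest, src).

def peephole(instrs):
    """Apply two local rewrite rules repeatedly until nothing changes."""
    changed = True
    while changed:
        changed = False
        out = []
        i = 0
        while i < len(instrs):
            cur = instrs[i]
            nxt = instrs[i + 1] if i + 1 < len(instrs) else None
            # Rule 1: a move from a register to itself does nothing.
            if cur[0] == "mov" and cur[1] == cur[2]:
                changed = True
                i += 1
                continue
            # Rule 2: "mov a, b" followed by "mov b, a" leaves both
            # registers equal after the first move, so drop the second.
            if (nxt is not None and cur[0] == "mov" and nxt[0] == "mov"
                    and cur[1] == nxt[2] and cur[2] == nxt[1]):
                out.append(cur)
                changed = True
                i += 2
                continue
            out.append(cur)
            i += 1
        instrs = out
    return instrs


prog = [
    ("mov", "rax", "rax"),  # removed by rule 1
    ("mov", "rbx", "rcx"),
    ("mov", "rcx", "rbx"),  # removed by rule 2
    ("add", "rax", "rbx"),
]
print(peephole(prog))
```

With a pass shaped like this, trying out a new rewrite means adding one
rule in one place, rather than threading a special case through the code
that prints assembly.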

On Thursday, June 16, 2016, Ben Gamari <ben at smart-cactus.org> wrote:

>
> Ccing David Spitzenberg, who has thought about proc-point splitting, which
> is relevant for reasons that we will see below.
>
>
> Harendra Kumar <harendra.kumar at gmail.com> writes:
>
> > On 16 June 2016 at 13:59, Ben Gamari <ben at smart-cactus.org> wrote:
> >>
> >> It actually came to my attention while researching this that the
> >> -fregs-graph flag is currently silently ignored [2]. Unfortunately this
> >> means you'll need to build a new compiler if you want to try using it.
> >
> > Yes I did try -fregs-graph and -fregs-iterative both. To debug why nothing
> > changed I had to compare the executables produced with and without the
> > flags and found them identical.  A note in the manual could have saved me
> > some time since that's the first place to go for help. I was wondering if I
> > am making a mistake in the build and if it is not being rebuilt
> > properly. Your note confirms my observation, it indeed does not change
> > anything.
> >
> Indeed; I've opened D2335 [1] to reenable -fregs-graph and add an
> appropriate note to the users guide.
>
> >> All-in-all, the graph coloring allocator is in great need of some love;
> >> Harendra, perhaps you'd like to have a try at dusting it off and perhaps
> >> look into why it regresses in compiler performance? It would be great if
> >> we could use it by default.
> >
> > Yes, I can try that. In fact I was going in that direction and then stopped
> > to look at what llvm does. llvm gave me impressive results in some cases
> > though not so great in others. I compared the code generated by llvm and it
> > perhaps did a better job in theory (used fewer instructions) but due to
> > more spilling the end result was pretty similar.
> >
> For the record, I have also struggled with register spilling issues in
> the past. See, for instance, #10012, which describes a behavior which
> arises from the C-- sinking pass's unwillingness to duplicate code
> across branches. While in general it's good to avoid the code bloat that
> this duplication implies, in the case shown in that ticket duplicating
> the computation would be significantly less code than the bloat from
> spilling the needed results.
>
> > But I found a few interesting optimizations that llvm did. For example,
> > there was a heap adjustment and check in the looping path which was
> > redundant and was readjusted in the loop itself without use. LLVM either
> > removed the redundant _adjustments_ in the loop or moved them out of the
> > loop. But it did not remove the corresponding heap _checks_. That makes me
> > wonder if the redundant heap checks can also be moved or removed. Perhaps
> > we can do some sort of loop analysis at the CMM level itself and avoid or
> > remove the redundant heap adjustments as well as checks, or at least float
> > them out of the cycle wherever possible. That sort of optimization can make
> > a significant difference to my case at least. Since we are explicitly aware
> > of the heap at the CMM level there may be an opportunity to do better than
> > llvm if we optimize the generated CMM or the generation of CMM itself.
> >
> Very interesting, thanks for writing this down! Indeed if these checks
> really are redundant then we should try to avoid them. Do you have any
> code you could share that demonstrates this?
>
> It would be great to open Trac tickets to track some of the optimization
> opportunities that you noted we may be missing. Trac tickets are far
> easier to track over longer durations than mailing list conversations,
> which tend to get lost in the noise after a few weeks pass.
>
> > A thought that came to my mind was whether we should focus on getting
> > better code out of the llvm backend or the native code generator. LLVM
> > seems pretty good at the specialized task of code generation and low level
> > optimization, it is well funded, widely used and has big community
> > support. That allows us to leverage that huge effort and take advantage of
> > the new developments. Does it make sense to outsource the code generation
> > and low level optimization tasks to llvm, with ghc focussing on higher level
> > optimizations which are harder to do at the llvm level? What are the
> > downsides of using llvm exclusively in future?
> >
>
> There is indeed a question of where we wish to focus our optimization
> efforts. However, I think using LLVM exclusively would be a mistake.
> LLVM is a rather large dependency that has in the past been rather
> difficult to track (this is why we now only target one LLVM release in a
> given GHC release). Moreover, it's significantly slower than our
> existing native code generator. There are a number of reasons for this,
> some of which are fixable. For instance, we currently make no effort to tell
> LLVM which passes are worth running and which we've handled; this is
> something which should be fixed but will require a rather significant
> investment by someone to determine how GHC's and LLVM's passes overlap,
> how they interact, and generally which are helpful (see GHC #11295).
>
> Furthermore, there are a few annoying impedance mismatches between Cmm
> and LLVM's representation. This can be seen in our treatment of proc
> points: when we need to take the address of a block within a function
> LLVM requires that we break the block into a separate procedure, hiding
> many potential optimizations from the optimizer. This was discussed
> further on this list earlier this year [2]. It would be great to
> eliminate proc-point splitting but doing so will almost certainly
> require cooperation from LLVM.
>
> Cheers,
>
> - Ben
>
>
> [1] https://phabricator.haskell.org/D2335
> [2] https://mail.haskell.org/pipermail/ghc-devs/2015-November/010535.html
>