Removing Hoopl dependency?

Mon Jun 12 07:06:58 UTC 2017

Interesting!

Maybe there are a couple of different alternatives:

A. A rewrite of Hoopl, with all the same basic ideas and data structures, but with a better API (I’m not sure exactly in what way, but Michal has some ideas, as does Sophie) and a more efficient implementation.

B. A more radical change to use hypergraphs, type-level lists etc. This sounds interesting, but it’s a more substantial change, and before using it for GHC we’d need to discuss the newly proposed API in some detail.

There’s no reason we couldn’t do (A) and (B) in parallel.

Michal is suggesting doing (A) in GHC’s tree, but with a clearly-declared intent to bring it out as a separate library. (I’d advocate making it a separate library in GHC’s tree; we already have a number of those.)

That would leave Sophie free to do (B) without the constraint of GHC depending on it; we could always adopt it later.

Does that sound plausible?  Do we know of any other Hoopl users?

Simon

From: Sophie Taylor [mailto:sophie at traumapony.org]
Sent: 11 June 2017 14:09
To: Michal Terepeta <michal.terepeta at gmail.com>; Simon Peyton Jones <simonpj at microsoft.com>; ghc-devs <ghc-devs at haskell.org>
Cc: Kavon Farvardin <kavon at cs.uchicago.edu>
Subject: Re: Removing Hoopl dependency?

Hello, fellow workers!

So, I'll pop in here with my thoughts.

I'm writing an independent intermediate language library for functional languages, and I looked at using Hoopl. I would use it, but there are several reasons why I'm not currently doing so:

1) Combining facts from different domains through fancy lattice algorithms. This is fairly straightforward to add to Hoopl with minimal extra API change.

2) I wanted to write my data facts as a type-level list, `freer-effects` style, in order to be more explicit in my types about dependencies between analyses. This would require significantly altering the API.

3) Its own custom graph code. This is the biggest reason why I decided not to. Some problems:
  * It seems impossible to change the topology of the graph in a rewriting step.
  * I wanted to use term hypergraphs/hyperjungles due to some pretty nifty properties.
  * The intermediate language I'm implementing, a derivative of Graph Reduction Intermediate Notation, aka GRIN from UHC, is, as its name implies, intrinsically graph-based. Thus, graph manipulation has to be pretty easy to do.
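
For illustration of point 1, the simplest way to combine facts from two domains is a pointwise product of their lattices. The sketch below is self-contained and only an assumption about what such a combinator could look like: `Lattice`, `joinFacts`, `pairLattice`, `reach` and `constProp` are hypothetical names, not Hoopl's API, and fancier combinations (where one domain refines the other) would need more than this.

```haskell
-- Hypothetical sketch; none of these names are taken from Hoopl.

-- | A join-semilattice with a bottom element. 'joinFacts' takes the old
-- and the new fact and returns (did the old fact change?, joined fact).
data Lattice a = Lattice
  { bottom    :: a
  , joinFacts :: a -> a -> (Bool, a)
  }

-- | Combine two lattices pointwise into a lattice over pairs of facts.
pairLattice :: Lattice a -> Lattice b -> Lattice (a, b)
pairLattice la lb = Lattice
  { bottom    = (bottom la, bottom lb)
  , joinFacts = \(a1, b1) (a2, b2) ->
      let (ca, a) = joinFacts la a1 a2
          (cb, b) = joinFacts lb b1 b2
      in  (ca || cb, (a, b))
  }

-- | Example domain 1: "may this point be reached?", joined with (||).
reach :: Lattice Bool
reach = Lattice
  { bottom    = False
  , joinFacts = \old new -> let j = old || new in (j /= old, j)
  }

-- | Example domain 2: a flat constant lattice for a single variable.
data Const = Unknown | Known Int | Conflict
  deriving (Eq, Show)

constProp :: Lattice Const
constProp = Lattice
  { bottom    = Unknown
  , joinFacts = \old new -> let j = joinConst old new in (j /= old, j)
  }
  where
    joinConst Unknown x = x
    joinConst x Unknown = x
    joinConst (Known a) (Known b) | a == b = Known a
    joinConst _ _ = Conflict
```

With these definitions, `joinFacts (pairLattice reach constProp)` reports a change whenever either component changes, which is what a fixpoint iteration needs to know.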

So instead, I've decided to optimise another hypergraph library (`graph-rewriting`, which I'm going to rewrite to use an inductive representation à la FGL) and implement a generic, Hoopl-esque analysis library on top of that. (Or more accurately, that is my plan for the next six months; I've been sidetracked getting parsing to work nicely with an effect-based stack!)
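
As a rough illustration of what "an inductive representation à la FGL" means, here is a minimal, self-contained sketch for ordinary directed graphs (not hypergraphs). It does not use the `fgl` or `graph-rewriting` packages; `Context`, `embed`, `match` and `dfs` are names chosen for this sketch, and the association-list representation is an assumption made for brevity, not an efficient design.

```haskell
-- Hypothetical sketch of an inductive view on graphs; not the fgl API.

type Node = Int

-- | A context: predecessors, the node itself, its label, successors.
data Context a = Context [Node] Node a [Node]
  deriving (Eq, Show)

-- | Adjacency list: each node maps to its label and its successors.
newtype Graph a = Graph [(Node, (a, [Node]))]
  deriving (Eq, Show)

emptyGraph :: Graph a
emptyGraph = Graph []

-- | Add a new node together with its edges to nodes already in the graph.
-- A self-loop is expressed by listing the node among its own successors.
embed :: Context a -> Graph a -> Graph a
embed (Context ps n l ss) (Graph g) =
  Graph ((n, (l, ss))
          : [ (m, (lm, if m `elem` ps then n : sm else sm))
            | (m, (lm, sm)) <- g ])

-- | Decompose: pull node n out with its context, leaving the remaining
-- graph with every edge to or from n removed.
match :: Node -> Graph a -> Maybe (Context a, Graph a)
match n (Graph g) = do
  (l, ss) <- lookup n g
  let rest  = [ (m, (lm, filter (/= n) sm)) | (m, (lm, sm)) <- g, m /= n ]
      preds = [ m | (m, (_, sm)) <- g, m /= n, n `elem` sm ]
  pure (Context preds n l ss, Graph rest)

-- | Depth-first traversal written by structural recursion on 'match'.
-- No visited set is needed: a visited node is no longer in the graph.
dfs :: [Node] -> Graph a -> [Node]
dfs []       _ = []
dfs (n : ns) g = case match n g of
  Just (Context _ _ _ ss, g') -> n : dfs (ss ++ ns) g'
  Nothing                     -> dfs ns g

-- | Example graph with edges 1->2, 1->3, 2->3 and 3->1.
example :: Graph String
example =
  embed (Context [1, 2] 3 "c" [1])
    (embed (Context [1] 2 "b" [])
      (embed (Context [] 1 "a" []) emptyGraph))
```

Here `dfs [1] example` yields `[1,3,2]`. Because `match` and `embed` take a graph apart and rebuild it one context at a time, changing the topology during a rewrite amounts to embedding a different context than the one that was matched.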

So, if Hoopl2 does become a thing, I'd be very keen on working on it, but if I were to actually use it myself, it'd probably require a complete rewrite. Fortunately, it's a pretty small library; and for GHC, its current usage is a pretty straightforward use case which shouldn't be affected too much. That being said, if GHC were to make better use of Hoopl (e.g. moving some of the optimisations on Core to be Hoopl-based passes) then it would be a different story.

So I guess I'm volunteering to do the rewrite for a potential Hoopl2 if it's wanted, as I'm about to do pretty much that anyway.

Cheers,
Sophie

On Fri, 9 Jun 2017 at 22:31 Michal Terepeta <michal.terepeta at gmail.com> wrote:
> On Fri, Jun 9, 2017 at 9:50 AM Simon Peyton Jones <simonpj at microsoft.com> wrote:
> > Maybe this is the core of our disagreement - why is it a good idea to have Hoopl as a separate package in the first place?
>
>
> One reason only: because it makes Hoopl usable by compilers other than GHC.  And, dually, efforts by others to improve Hoopl will benefit GHC.
>
> > If I proposed extracting parts of Core optimizer to a separate package, wouldn't you expect some really good reasons for doing this?
>
>
> A re-usable library should be
> a) a significant chunk of code,
> b) that can plausibly be re-purposed by others
> c) and that has an explicable API
>
> I think the Core optimiser is so big, and so GHC specific, that (b) and (c) are unlikely to hold.  But we carefully designed Hoopl from the ground up so that it was agnostic about the node types, and so can be re-used for control flow graphs of many kinds.  It’s designed to be re-usable.  Whether it is actually re-used is another matter, of course.  But if it’s part of GHC, it can’t be.

I agree with your characterization of a re-usable library and that
Core optimizer would not be a good fit. But I do think that Hoopl also
has some problems with b) and c) (although smaller):
- Using an optimizer-as-a-library is not really common (I'm not aware
  of any compilers doing this; LLVM comes close to some degree, but it
  exposes the whole language as the interface, so it's closer to the
  idea of extracting the whole Cmm backend). So I don't think the API
  for such a project is well understood.
- The API is pretty wide and does put serious constraints on the IR
  (after all it defines blocks and graphs), making reusability
  potentially more tricky.
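
To make the second point concrete, here is a simplified, self-contained
imitation of the shape-indexed style that Hoopl asks of a client IR.
It is a sketch only: `Node`, `Block`, `blockEntry` and
`blockSuccessors` are toy definitions written for this example, not
Hoopl's own, and `Label` is just an `Int` here.

```haskell
{-# LANGUAGE GADTs #-}
-- Toy definitions in the style of Hoopl; not the library's actual types.

data O  -- open: control falls through to/from the neighbouring node
data C  -- closed: control enters or leaves only via labels

type Label = Int

-- | A toy IR whose nodes carry their entry and exit shape as type indices.
data Node e x where
  Entry  :: Label                    -> Node C O
  Assign :: String -> Int            -> Node O O
  Branch :: Label                    -> Node O C
  CondBr :: String -> Label -> Label -> Node O C

-- | A basic block: one entry node, straight-line middles, one exit node.
data Block = Block (Node C O) [Node O O] (Node O C)

-- | Only closed-on-entry nodes have a label to jump to.
entryLabel :: Node C x -> Label
entryLabel (Entry l) = l

-- | Only closed-on-exit nodes have successors.
successors :: Node e C -> [Label]
successors (Branch l)     = [l]
successors (CondBr _ t f) = [t, f]

blockEntry :: Block -> Label
blockEntry (Block e _ _) = entryLabel e

blockSuccessors :: Block -> [Label]
blockSuccessors (Block _ _ x) = successors x
```

The types rule out malformed blocks (for instance a branch in the
middle), but they also mean that an IR which does not fit this
entry/middle/exit discipline has to be reshaped before it can use the
library.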

So I think I understand your argument and we just disagree on whether
this is worth the effort of having a separate package.

>
> [...]
>
> > I've pointed out multiple reasons why I think it has a significant cost.
>
> Can you just summarise them again briefly for me?  If we are free to choose nomenclature and API for hoopl2, I’m not yet seeing why making it a separate package is harder than not doing so. E.g. template-haskell is a separate package.

Having even Hoopl2 as a separate package would still entail
additional work:
- Hoopl2 would still need to duplicate some concepts (e.g. `Unique`),
  since it needs to be standalone.
- Understanding code (esp. by newcomers) would be harder: the Cmm
  backend would be split between GHC and Hoopl2, with the latter
  necessarily being far more general/polymorphic than needed by GHC.
- Getting the right performance in the presence of all this additional
  generality/polymorphism will likely require a fair amount of
  additional work.
- If Hoopl2 is used by other compilers, then we need to be more
  careful changing anything in incompatible ways, this will require
  more discussions & release coordination.
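
As a small illustration of the first point, a standalone library
cannot import GHC's internals, so it ends up carrying its own version
of something like the following. This is a hypothetical sketch: the
`Fresh` monad and the definitions of `Unique`, `Label`, `freshLabel`
and `evalFresh` below are written for this example and are not taken
from GHC or Hoopl.

```haskell
-- Hypothetical standalone unique supply; not GHC's or Hoopl's code.

newtype Unique = Unique Int deriving (Eq, Ord, Show)
newtype Label  = Label Unique deriving (Eq, Ord, Show)

-- | A minimal state-passing monad that hands out fresh integers.
newtype Fresh a = Fresh { runFresh :: Int -> (a, Int) }

instance Functor Fresh where
  fmap f (Fresh g) = Fresh $ \s -> let (a, s') = g s in (f a, s')

instance Applicative Fresh where
  pure a = Fresh $ \s -> (a, s)
  Fresh f <*> Fresh g = Fresh $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in  (h a, s2)

instance Monad Fresh where
  Fresh g >>= k = Fresh $ \s -> let (a, s') = g s in runFresh (k a) s'

-- | Allocate a label that has not been handed out before.
freshLabel :: Fresh Label
freshLabel = Fresh $ \s -> (Label (Unique s), s + 1)

-- | Run a computation, starting the supply at zero.
evalFresh :: Fresh a -> a
evalFresh m = fst (runFresh m 0)
```

A compiler that already has its own notion of uniques then needs
glue code to convert between the two, which is the kind of
duplication meant above.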

Considering that Hoopl was never actually picked up by other
compilers, I'm not convinced that this cost is justified. But I
understand that other people might have a different opinion.
So how about a compromise:
- decouple GHC from the current Hoopl (ie, go ahead with my diff),
- keep everything Hoopl related only in `compiler/cmm/Hoopl` with the
  long-term intention of creating a separate package,
- experiment with and improve the code,
- once (if?) we're happy with the results, discuss what/how to
  extract to a separate package.
That gives us the freedom to try things out and see what works well
(I simply don't have ready solutions for everything, so being able to
experiment is IMHO quite important). And once we reach the right
performance/representation/abstraction/API we can work on extracting
that.

What do you think?

Cheers,
Michal

_______________________________________________
ghc-devs mailing list
ghc-devs at haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs