Moving Haddock *development* out of GHC tree

Simon Peyton Jones simonpj at microsoft.com
Fri Aug 8 07:48:35 UTC 2014


Mateusz

What you say makes sense to me.

For me, the big thing is that we can make, and push, changes to Haddock in the GHC private branch, without having to negotiate.  (Haddock reaches very deep into GHC's internals, so many, many changes to GHC have some knock-on effect in Haddock.)  You seem OK with this, so I am too.

One concern: if you and Simon Hengel pay no attention to the GHC HEAD fork of Haddock, there is no guarantee that it works at all.  Presumably it compiles (because GHC's build system will build it, forcing us to fix type errors) but it might not actually work!  So it would probably pay for you to watch what is happening, to ensure that the patch-ups that ignorant GHC developers apply to Haddock do indeed have the desired effect.

Some of these patch-ups might even be panics --- "I don't know how to make Haddock render new construct <foobar>".  That might be quite reasonable.

But in general, thumbs up from me.

Simon

| -----Original Message-----
| From: ghc-devs [mailto:ghc-devs-bounces at haskell.org] On Behalf Of
| Mateusz Kowalczyk
| Sent: 08 August 2014 06:25
| To: ghc-devs at haskell.org
| Cc: Simon Hengel
| Subject: Moving Haddock *development* out of GHC tree
| 
| Hello,
| 
| A slightly long e-mail, but I ask that you voice your opinion if you
| have ever changed the GHC API. You can skim over the details; simply
| know that it saves me a vast amount of time, allows me to try and find
| contributors and doesn't impact GHC negatively. It seems like a win-win
| scenario for GHC and Haddock. The GHC team's workflow does not change
| and will not require any new commitment: I do all the work and I think
| it's a one-line change in sync-all when the transition is ready. Here
| it is:
| 
| 
| It is no secret that many core Haskell projects lack developer hands
| and Haddock is no exception: the current maintainers are Simon Hengel
| and myself. Simon does not have much time so currently all the issues
| and updates are up to me. Ideally I would like it if some more people
| could come and hack on Haddock but there are a couple of problems with
| trying to recruit folk for this:
| 
| 1. Interacting with the GHC API is not the easiest thing. This is
| Haddock's problem but I thought I'd mention it here.
| 
| 2. Haddock resides directly in the GHC tree and it is currently
| *required* that it compiles with GHC HEAD. This is a huge barrier to
| entry for anyone: today I wanted to make a fairly simple change but it
| still took me 3 validate runs to be at least somewhat confident that I
| didn't break much in GHC. On top of this I had help from Edward Z. Yang
| on IRC and information from him on what exactly the issue was. If I
| were to do everything alone it would have taken even more validates. A
| validate is not fast on my machine by any means: it takes an hour or
| two.
| 
| Here is what I want to do unless there are major objections: I want to
| move the active development away from the GHC tree. Below is how it
| would work. For simplicity please imagine that we have *just* released
| 7.8.3.
| 
| * Haddock development would concentrate on supporting the last public
| release of GHC: I would stop developing against GHC HEAD and, for now,
| develop against 7.8.3.
| 
| * GHC itself checks out Haddock as a submodule as it does now. The only
| difference is that it points at whatever commit worked last. Let us
| assume it is the Haddock 2.14.3 release commit. The vital difference
| from the current state is that GHC will no longer track changes in
| Haddock's master branch.
| 
| * Now when the GHC API changes, things proceed as they normally do:
| whoever is responsible for the changes pops into the Haddock submodule
| and applies the patches necessary for Haddock to build with HEAD, and
| everyone is happy. What does *not* happen is these patches going into
| master: I ignore them and keep working with 7.8.3.
| 
| * When a GHC release rolls around, I update Haddock to work with the
| new API so that people with the new release can still use it. Once it
| works against the new API, GHC can start tracking from that commit
| onwards and proceed as usual.
| 
| Here are the advantages:
| 
| * I don't have to work against GHC HEAD. This means I don't have to
| build GHC HEAD and I don't need to worry about GHC API changes. I don't
| waste 2-4 hours building before hacking and validating after hacking to
| make any minor changes and to make sure I haven't broken anything.
| 
| * More importantly, anyone who wants to write a patch for Haddock can
| now do so easily: all they need is a recent compiler rather than being
| forced to build HEAD. Building and validating against HEAD is a
| **huge** barrier to entry.
| 
| * I only have to care about GHC API changes once a release and not
| twice a week. I think PatternSynonyms have changed 4 times in a month
| but the end result at release time is the same and that's what people
| care about.
| 
| * It is less work for anyone changing the GHC API: they only have to
| deal with their own changes and not my changes which add features or
| whatever.
| 
| * If I break something in Haddock HEAD, GHC is not affected.
| 
| * If Haddock's binary interface doesn't change, we may even allow more
| versions of GHC to be compatible through CPP and other such trickery.
| If we were to do it today, it would be an increased burden on the GHC
| team to deal with those.
| 
| * I can release as often as I want against the same compiler version.
| Currently doing this requires backporting features (see v2.14 branch)
| which is a massive pain. I no longer have to tell the users 'yes, your
| bug is fixed but to get it you need to compile GHC HEAD or wait 6-12
| months until the next GHC release'. I have to do this a lot.
| 
| Here are the disadvantages and why I think they don't make a big
| difference:
| 
| * GHC HEAD doesn't get any new-and-cool features that we might
| implement. I say this doesn't matter because no one uses varying GHC
| HEAD versions to develop actual software, documentation and all. What I
| mean to say is that the only user of the Haddock that's developed in
| the GHC tree is GHC itself. The only case where GHC actually used
| in-tree Haddock was when Herbert generated documentation for base-4.7
| early for me to eye before the release. Even this doesn't matter
| because so close to the release I'll already have the existing GHC API
| integrated anyway.
| Again, it does not matter if GHC HEAD itself doesn't get pretty
| operator rendering or whatever right when I implement it because no one
| cares about it until it's release time. I know that many people simply
| set HADDOCK_DOCS=NO to save time. The actual users only care about
| Haddock that works with 7.6.x, 7.8.x, 7.10.x; only GHC cares about the
| in-betweens, and only for the purpose of being able to build and
| validate.
| 
| * The GHC team can't easily contribute features and get them back
| immediately. In part it doesn't matter because of the previous point,
| and in the last year or so there were no features contributed directly
| from GHC except those necessary to keep Haddock compiling. This just
| means there's no demand for such a close relationship.
| 
| * Haddock-affecting changes in the GHC parser don't 'take effect'
| straight away. This is my loss and, considering how infrequently such
| changes happen, it's a tiny price to pay to have to wait until release.
| 
| * ...that's it, no other disadvantages that I can think of, but that's
| why I'm sending it to the list to review!
| 
| What's worth mentioning is that the no-external-dependencies thing
| still applies because even though we no longer need to compile against
| HEAD, we still need to compile against the tree at release time.
| 
| In summary:
| 
| My life gets easier because I stop wasting it on playing with the whole
| GHC tree; the GHC team's life gets easier because they don't have to
| deal with the changes I make. My life gets even easier because I only
| have to make big API updates once a release. I can actually start
| looking for contributors.
| 
| When a release rolls around, GHC and Haddock 'meet up', we make sure it
| all works, the release happens, GHC starts tracking from that point and
| we part ways until the next release.
| 
| What do you think? If there are no major objections in one week then I
| will assume I am good to go with this.
| 
| Transition from current setup:
| If I receive some patches I was promised then I will make a 2.14.4
| bugfix/compat release, make sure that master is up to date and then
| create something like a GHC-tracking branch from master and have GHC
| track that. I will then abandon that branch and not push to it unless
| it is GHC release time. The next commit in master will bring Haddock to
| a state where it works with 7.8.3: yes, this means removing all new API
| stuff until 7.10 or 7.8.4 or whatever. GHC API changes go onto
| GHC-tracking while all the stuff I write goes into master. When GHC
| makes a release or is about to, I make master work with that and make
| GHC-tracking point to that instead.
| 
| 
| Thanks!
| --
| Mateusz K.