Haddock strings in .hi files

Fri Mar 21 11:38:33 UTC 2014

Ok, I buy the argument that if we're already compiling everything, we 
shouldn't have to re-typecheck it all in Haddock. Of course if you're 
*not* already compiling everything, then the argument doesn't apply: 
Haddock does support generating documentation from source files without 
precompiling them, but I think if you ask the GHC API to load modules 
with -fno-code it should do the right thing: load up the .hi files if 
they're up to date, or typecheck the modules otherwise.

So I think having GHC spit out the docs as a side-effect of compilation 
is fine, so long as we don't have to do all the Haddock processing 
inside GHC itself, and provided this eliminates Haddock's own interface 
files (which are a pain).  If the docs go in the .hi file, then they 
must go in a separate section that is lazily parsed - we already do this 
for various other sections in the .hi file.
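
As a rough illustration of the lazy-section idea, here is a toy model in
plain Haskell. It is not GHC's actual interface-file format or API: the
names Iface, loadIface, parseDoc and lookupDoc, and the one-line-per-entry
text layout, are all made up for this sketch. The only point is that the
doc section stays an unevaluated thunk until some consumer asks for it.

```haskell
import Data.Maybe (mapMaybe)

-- Toy model only: NOT GHC's real .hi representation.
-- Both fields are ordinary lazy fields; the docs are only decoded
-- when a consumer actually demands them.
data Iface = Iface
  { ifaceExports :: [String]            -- names the module exports
  , ifaceDocs    :: [(String, String)]  -- (name, doc string) pairs
  }

-- Build an interface from two raw sections (hypothetical layout:
-- exports are whitespace-separated, docs are one "NAME TEXT" per line).
loadIface :: String -> String -> Iface
loadIface exportSection docSection = Iface
  { ifaceExports = words exportSection
  , ifaceDocs    = mapMaybe parseDoc (lines docSection)
  }

-- Split "NAME TEXT" at the first space; skip lines with no name.
parseDoc :: String -> Maybe (String, String)
parseDoc l = case break (== ' ') l of
  ("", _)      -> Nothing
  (name, text) -> Just (name, drop 1 text)

-- Looking up a doc string is what finally forces the doc section.
lookupDoc :: String -> Iface -> Maybe String
lookupDoc name = lookup name . ifaceDocs
```

A client that only reads ifaceExports never touches the doc section, so
it pays nothing for the docs being present in the file.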

I don't think this is easy, but it's probably doable.  The code that 
attaches docs to declarations is currently part of Haddock itself, so 
perhaps this has to move into GHC.
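
To make "attaching docs to declarations" concrete, here is a deliberately
simplified sketch. It is not Haddock's algorithm: attachDocs is a
hypothetical helper that works on raw source lines, only understands
single-line "-- | " comments directly followed by a "name :: type"
signature, and ignores everything else. The real code works on GHC's
parsed syntax tree rather than on text.

```haskell
import Data.List (stripPrefix)

-- Toy sketch only: pair each "-- | " comment with the name whose type
-- signature appears on the very next line. Any other line in between
-- discards the pending comment.
attachDocs :: [String] -> [(String, Maybe String)]
attachDocs = go Nothing
  where
    go _ [] = []
    go pending (l : ls)
      -- A doc comment: remember it for the next line.
      | Just doc <- stripPrefix "-- | " l = go (Just doc) ls
      -- A type signature: emit the name with whatever doc is pending.
      | (name : "::" : _) <- words l      = (name, pending) : go Nothing ls
      -- Anything else: drop the pending comment and carry on.
      | otherwise                         = go Nothing ls
```

For example, applied to the four lines "-- | Adds one.",
"inc :: Int -> Int", "inc = (+1)" and "dec :: Int -> Int", it pairs inc
with its comment and leaves dec undocumented.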

Cheers,
Simon

On 20/03/2014 16:41, Edward Kmett wrote:
> My knowledge of precisely how haddock works is somewhat fuzzy in that it
> arises from a series of discussions a couple of years back.
>
> My observation was mostly that when I run 'cabal install', it goes
> through all the modules building my .hi files, etc. Then I run 'cabal
> haddock' and it spends all that time redoing the same work, just to go
> through and get at some information that we had right up until the
> moment we finished building.
>
> I'm not wedded to bolting the information into the .hi files being the
> solution, but the idea that we could avoid redoing that work is
> tantalizing. I'm mostly trying to avoid doing the same work twice
> in the build cycle of the average user.
>
> If there is an alternative strategy, such as, oh, I don't know, making
> haddock able to hook in plugin-style late, as we're generating the .hi
> file, to spit out what it needs to something else and
> interrogate/rename/whatever it needs from the rest of the GHC API, I'd
> be totally open to that as well.
>
> -Edward
>
>
> On Thu, Mar 20, 2014 at 12:18 PM, Mateusz Kowalczyk
> <fuuzetsu at fuuzetsu.co.uk> wrote:
>
>     On 20/03/14 16:08, Edward Kmett wrote:
>      > One strong reason for considering at least including the haddocks
>      > in the .hi files is build times.
>      >
>      > Currently if you have cabal configured to build and document
>      > every package, running haddock requires you to recompile your
>      > entire source tree a second time to get information that we just
>      > dropped on the floor before spitting out the .hi file.
>      >
>      > For most of the users of GHC this is a 50% difference in compile
>      > times if they have cabal configured to generate haddocks.
>      >
>      > GHC doesn't have to understand the haddocks any more than it does
>      > now to support it, just include the content.
>      >
>      > Haddock could then just go through and load the .hi files rather than
>      > starting from scratch with parsing and typechecking the entire
>      > module, running template-haskell, just to get at the documentation.
>      >
>      > Any pythonesque :doc command support to me would be gravy.
>      >
>      > The reason I care at all is the build times. I regularly lose
>      > minutes out of each build just to regenerate docs and wind up
>      > skipping building them as much as I can get away with to avoid
>      > the pain.
>      >
>      > -Edward
>      >
>      >
>
>     As Simon M points out, we still have to run the renamer which seems to
>     be tightly bound with the type-checker. Where do you suggest the
>     sizeable performance increase would be coming from in this case? For all
>     the existing packages, we already read the docs from .haddock files so
>     there's no difference there. For new packages we have to type-check and
>     generate .haddock anyway so there's no difference there either.
>
>     It's not really about GHC having to know more about Haddock, it's about
>     Haddock having to use GHC anyway, whether the strings are embedded
>     or not.
>
>     --
>     Mateusz K.
>
>