GHC source code improvement ideas
Simon Marlow
simonmarhaskell at gmail.com
Fri Jan 4 09:06:33 EST 2008
Twan van Laarhoven wrote:
> Hello GHC people,
>
> I was trying my hand at some GHC hacking, and I couldn't help but notice
> that much of the code looks (IMHO) horrible. There are some things that
> look as if they haven't been touched since the Haskell 1.3 days. The
> most striking is the use of `thenXX` instead of do notation.
>
> I was thinking of cleaning up some of this. In particular:
>
> 1a. Use do notation where possible instead of `thenXX`.
Yes.
> 1b. Even better, use Applicative where possible. For example
>
> ds_type (HsFunTy ty1 ty2)
> = dsHsType ty1 `thenM` \ tau_ty1 ->
> dsHsType ty2 `thenM` \ tau_ty2 ->
> returnM (mkFunTy tau_ty1 tau_ty2)
>
> Could be rewritten as
>
> ds_type (HsFunTy ty1 ty2)
> = mkFunTy <$> dsHsType ty1 <*> dsHsType ty2
No, IMO. This just adds another abstraction that a potential GHC
contributor has to learn about. Including me :-) It doesn't hurt to write
out the code in full sometimes, and as the GHC coding style guidelines say:
Remember: other people have to be able to quickly understand what you've
done, and overuse of abstractions just serves to obscure the really
tricky stuff, and there's no shortage of that in GHC.
http://hackage.haskell.org/trac/ghc/wiki/Commentary/CodingStyle
(I think I might be quoting myself there, in which case it doesn't really
add much credibility to my argument :).
Also bear in mind that we have to be able to compile GHC with older
versions of itself (currently back to 6.2.2), which explains why we're
often conservative with respect to using external libraries.
> 2. Investigate the need for all these mappM functions. To quote the
> source: "Standard combinators, but specialised for this monad (for
> efficiency)". Wouldn't a SPECIALIZE pragma work just as well?
It does look like we could do away with some of those, yes.
> 3. In general, use standard functions when they are available.
> Currently GHC has its own sort function and its own versions of
> FiniteMap/Data.Map and Text.PrettyPrint. Using standard functions/data
> structures will a) make GHC smaller and therefore easier to maintain,
> and b) will make the code easier to understand for someone who knows the
> standard library.
In general yes. But bear in mind there's another principle at work here:
we want to insulate GHC as much as possible from its environment, to reduce
the chances of spurious failures or performance regressions.
I believe our version of Text.PrettyPrint has diverged from the library
version - last time I looked it wasn't obvious that we could replace it.
Regarding Data.Map, I'd be interested in trying out AVL trees instead, to
see if they have better performance, but then we'd have to import the code
of course.
> 4. A more radical change would be introducing hierarchical modules.
> This could be just a matter of renaming the directories to start with an
> upper case character and changing the import declarations. This gives
> module names like "Typecheck.TcGadt". The tc is redundant here, so
> perhaps it should be renamed to "Typecheck.Gadt" or "Typecheck.GADT".
> Perhaps even better would be "GHC.Typecheck.GADT", this way some modules
> can be exposed as part of the GHC api.
Yes, I've wondered about doing this in the past. It does mean more typing,
though: clearly 'import BasicTypes.Id' is more painful than just 'import
Id' and doesn't add much, so IMO we'd need to put some thought into a
structure that makes sense.
> How do the GHC developers feel about this? Would such paches be gladly
> excepted? Or will they be directly forwarded to the bit bucket?
Within the constraints I've outlined above, cleanups are definitely welcome.
Another worthwhile gardening-type activity is to go around elimianting
warnings. Many modules have {-# OPTIONS -w #-} at the top, waiting for
someone to go in and clean up all the warnings. They do catch real bugs,
so this is not a waste of time by any means.
In addition to this, things like
- identifying and removing dead code
- commoning up duplicate code fragments
- improving test coverage
- profiling and optimising hotspots
are all useful tasks.
Cheers,
Simon
More information about the Glasgow-haskell-users
mailing list