GHC source code improvement ideas
simonpj at microsoft.com
Fri Jan 4 03:34:22 EST 2008
In general I'd be delighted for you to clean up GHC's code -- thank you! There is a slight down-side which is that because you end up touching a lot of code, you get a big patch that means you can't selectively pick patches from before and after the change -- but I'm not too bothered about that, and its balanced by the clean-up.
Let's also see what Simon and Ian (or others) have to say.
Best to do one change per patch.
A very useful thing to do would be to improve the Commentary:
In fiddling with GHC's codebase you will surely learn stuff that you wish you'd known to start with, and that is the Ideal Moment to capture it in writing on the Wiki. A real service to others.
| 1a. Use do notation where possible instead of `thenXX`.
| 1b. Even better, use Applicative where possible. For example
| ds_type (HsFunTy ty1 ty2)
| = dsHsType ty1 `thenM` \ tau_ty1 ->
| dsHsType ty2 `thenM` \ tau_ty2 ->
| returnM (mkFunTy tau_ty1 tau_ty2)
| Could be rewritten as
| ds_type (HsFunTy ty1 ty2)
| = mkFunTy <$> dsHsType ty1 <*> dsHsType ty2
This works great when you have the pattern above -- but often it's a bit more complicated and then it does not work so well. So it's hard to employ the pattern consistently. Then you have to ask whether a mixture of styles is good.
My gut feel is that it's better to use do-notation consistently.
| 2. Investigate the need for all these mappM functions. To quote the
| source: "Standard combinators, but specialised for this monad (for
| efficiency)". Wouldn't a SPECIALIZE pragma work just as well?
The trouble is that currently you can only specialise a function at its definition site -- and at the definition of mapM the relevant monad isn't in scope because mapM is in a library.
I have no clue whether this has any effect on performance. You could try a small experiment, by defining mappM=mapM and seeing if there was any change.
| 3. In general, use standard functions when they are available. Currently
| GHC has its own sort function and its own versions of FiniteMap/Data.Map and
| Text.PrettyPrint. Using standard functions/data structures will a) make GHC
| smaller and therefore easier to maintain, and b) will make the code easier
| to understand for someone who knows the standard library.
In general yes.
But GHC is meant to be compilable with any GHC back to 6.2. So if the interface to a library changes between (say) 6.2 and 6.4, you end up with ifdefs. That can be a royal pain. Having the library code in the compiler source eliminates that problem.
The other thing is that it's possible that GHC's version of the library has more functions. Or uses some non-standard feature like unboxed types.
So proceed with caution here. Prettyprinting would be a strong candidate. FiniteMap too. But be careful with UniqFM -- it's a very very heavily-used library.
| 4. A more radical change would be introducing hierarchical modules. This
| could be just a matter of renaming the directories to start with an upper
| case character and changing the import declarations. This gives module names
| like "Typecheck.TcGadt". The tc is redundant here, so perhaps it should be
| renamed to "Typecheck.Gadt" or "Typecheck.GADT". Perhaps even better would
| be "GHC.Typecheck.GADT", this way some modules can be exposed as part of the
| GHC api.
I don't think I'd object, but I'm not sure that much is gained here. We don't spend much time on mondule-name clashes.
More information about the Glasgow-haskell-users