Syntax extensions (was: RE: The Future of Haskell discussion at the Haskell Workshop)

Thu, 11 Sep 2003 01:36:51 -0700

| We at GHC HQ agree, and for future extensions we'll move to 
| using separate options to enable them rather than lumping 
| everything into -fglasgow-exts.  This is starting to happen 
| already: we have -farrows, -fwith, -fffi (currently implied 
| by -fglasgow-exts).
| 
| Of course, if we change the language that is implied by 
| -fglasgow-exts now, we risk breaking old code :-)  Would folk 
| prefer existing syntax extensions be moved into their own 
| flags, or left in -fglasgow-exts for now?  I'm thinking of:
| 
|   - implicit parameters
|   - template haskell
|   - FFI
|   - rank-N polymorphism (forall keyword)
|   - recursive 'do' (mdo keyword)

Haskell gets pulled in many different directions to meet the needs
and whims of developers, researchers, and educators, among others.
For quite a long time, it seemed that the choice between "Standard
Haskell 98" and "Kitchen Sink Haskell with all the extras" was
adequately dealt with using a single command line option.  Those
looking for the stability of Haskell 98 got what they wanted by
default, while the adventurers looking to play with all the new
toys just added an extra "-fglasgow-exts" or "-98" or similar.

As the number of extensions grows (and with it the potential for
unexpected interactions), it is clear that we can't get by with that simple
scheme any more.  It's important that implementations continue to
provide the stable foundation, but people also need a more flexible
way to select extensions when they need them.

As a solution to that problem, the many-command-line-options
scheme described seems quite poor!  It's far too tool specific,
not particularly scalable, and somewhat troublesome from a software
engineering perspective.  We're not talking about a choice between
two points any more; there's a whole lattice of options, which, under
the proposal above, might be controlled through a slew of tool-specific
and either cryptic or verbose command line switches.  Will you
remember which switches you need to give to compile your code for
the first time in two months?  How easy will it be to translate
those settings if you want to run your code through a different
compiler?  How much help will the compiler give you in tracking
down a problem if you forget to include all the necessary switches?
And how will you figure out what options you need to use when you
try to combine code from library X with code from library Y, each
of which uses its own interesting slice through the feature set?

I know that some of these problems can be addressed, at least in
part, by careful use of Makefiles, {-# custom pragmas #-}, and perhaps
by committing to a single tool solution.  But I'd like to propose
a new approach that eliminates some of the command line complexities
by integrating the selection of language extensions more tightly
with the rest of the language.

The main idea is to use the module system to capture information
about which language features are needed in a particular program.
For example, if you have a module that needs implicit parameters,
Template Haskell, and TREX, then you'll indicate this by including
something like the following imports at the top of your code:

  import Extensions.Types.ImplicitParams
  import Extensions.Language.TemplateHaskell
  import Extensions.Records.TREX

Code that needs recursive do, O'Haskell style structs, rank-n
polymorphism, and multiple parameter classes might specify:

  import Extensions.Language.Mdo
  import Extensions.Records.Structs
  import Extensions.Types.RankN
  import Extensions.Types.Multiparam

Imports are always at the top of a module, so they're easy to
find, and so provide clear, accessible documentation.  (Don't
worry about the names I've picked here; they're intended to
suggest possibilities, but they're not part of the proposal.)

What, exactly, is in those modules?  Perhaps they just provide
tool-specific pragmas that enable/disable the corresponding
features.  Or perhaps the compiler detects attempts to import
particular module names and instead toggles internal flags.
But that's just an implementation detail: it matters only to the
people who write the compiler, and not the people who use it.
It's the old computer science trick: an extra level of indirection,
in this case through the module system, that helps to decouple
details that matter to Haskell programmers from details that
matter to Haskell implementers.
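
To make the second possibility a little more concrete, here is a rough
sketch of how a front end might spot such imports and turn them into
internal flags.  The Feature type, the knownExtensions table, and the
function names are all invented for this example; they are not taken
from any existing compiler, and a real implementation would use its
proper parser rather than splitting lines into words.

```haskell
import Data.Maybe (mapMaybe)

-- Hypothetical internal switches that a compiler might toggle.
data Feature
  = ImplicitParams | TemplateHaskell | Trex | RecursiveDo
  | Structs | RankN | MultiParam
  deriving (Eq, Show)

-- Assumed table of agreed module names and the flag each one stands for.
-- The names are the illustrative ones used in this message.
knownExtensions :: [(String, Feature)]
knownExtensions =
  [ ("Extensions.Types.ImplicitParams",     ImplicitParams)
  , ("Extensions.Language.TemplateHaskell", TemplateHaskell)
  , ("Extensions.Records.TREX",             Trex)
  , ("Extensions.Language.Mdo",             RecursiveDo)
  , ("Extensions.Records.Structs",          Structs)
  , ("Extensions.Types.RankN",              RankN)
  , ("Extensions.Types.Multiparam",         MultiParam)
  ]

-- The module name on a line of the form "import <name> ...", if any.
-- Simplification: "import qualified ..." and layout are not handled.
importedModule :: String -> Maybe String
importedModule line = case words line of
  ("import" : name : _) -> Just name
  _                     -> Nothing

-- Features requested by a source file, in order of appearance.
-- Imports of ordinary modules are simply ignored here.
requestedFeatures :: String -> [Feature]
requestedFeatures src =
  mapMaybe (`lookup` knownExtensions) (mapMaybe importedModule (lines src))
```

The point of the table is only that the mapping from module names to
flags lives inside the tool, so each implementation is free to fill it
in differently.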

Of course, code that does:

  import Extensions.Types.Multiparam

is not standard Haskell 98 because there's no such library in the
standard.  This is a good thing; our code is clearly annotated as
relying on a particular extension, without relying on the command
line syntax for a particular tool.  Moreover, if the implementers
of different tools can agree on the names they use, then code that
imports Extensions.Types.Multiparam will work on any compiler that
supports multiple parameter classes, even if the underlying
mechanisms for enabling/disabling those features are different.
When somebody tries to compile that same piece of code using a
tool that doesn't support the feature, they'll get an error
message about a missing import with a (hopefully) suggestive
name/description, instead of a cryptic "Syntax error in constraint"
or similar.  Also, when you come back to compile your code after some
time away, you won't need to remember which command line options you
need because it's all there, built into the source in a readable and
perhaps even portable notation. You just invoke the compiler (without
worrying about specifying options) and it does the rest!
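
As a small illustration of the kind of diagnostic I have in mind, a
tool could treat an unrecognized import under an agreed prefix
specially.  This is only a sketch: the "Extensions." prefix, the
supported list, and the wording of the message are assumptions made
for the example.

```haskell
import Data.List (isPrefixOf)

-- Extension modules that this imaginary tool happens to support.
supported :: [String]
supported = ["Extensions.Language.Mdo", "Extensions.Types.RankN"]

-- Check a single imported module name.  Ordinary modules are left to the
-- normal import machinery; an unsupported "Extensions." module produces
-- a message that names the missing feature.
checkImport :: String -> Either String ()
checkImport name
  | not ("Extensions." `isPrefixOf` name) = Right ()
  | name `elem` supported                 = Right ()
  | otherwise =
      Left ("Language extension not supported by this compiler: " ++ name)
```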

Hmm, ok, but perhaps you're worrying now about having to enumerate
a verbose list of language features at the top of each module you
write.  Isn't that going to detract from readability?  This is where
the module system wins big!  Just define a new module that imports all
the features you need, and then allows you to access them by a single
name.  For example, you could capture the second feature set above
in the following:

  module HackersDelight where
  import Extensions.Language.Mdo
  import Extensions.Records.Structs
  import Extensions.Types.RankN
  import Extensions.Types.Multiparam

Now the only thing you have to write at the top of a module that
needs some or all of these features is:

  import HackersDelight

Need to make use of code that comes from different sources?  No
problem!  If you write:

  import HackersDelight
  import AnotherSetofFeatures

then, through the semantics of import, you automatically get the
combination of both feature sets as the union of the two pieces.
(In the event that you have somehow requested incompatible features,
you'd get an error to help diagnose the problem.)
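
Here is a rough model of that behaviour, treating each feature-set
module as a name plus the list of modules it imports.  Everything in
it is hypothetical: the contents of AnotherSetofFeatures, and the
claim that TREX and structs clash, are made up purely so that the
example has something to report.

```haskell
import Data.List (nub)

-- A feature-set module: its name and the modules it imports.
type ModuleTable = [(String, [String])]

-- HackersDelight is as in the text; AnotherSetofFeatures is invented.
table :: ModuleTable
table =
  [ ("HackersDelight",
      [ "Extensions.Language.Mdo", "Extensions.Records.Structs"
      , "Extensions.Types.RankN", "Extensions.Types.Multiparam" ])
  , ("AnotherSetofFeatures",
      [ "Extensions.Types.ImplicitParams", "Extensions.Types.RankN" ])
  ]

-- Expand a name to the primitive extension modules it stands for.
-- Names missing from the table are treated as primitive extensions.
-- Assumes the import graph has no cycles.
expand :: ModuleTable -> String -> [String]
expand t name = case lookup name t of
  Just imports -> nub (concatMap (expand t) imports)
  Nothing      -> [name]

-- Pairs assumed, for illustration only, to be incompatible.
incompatible :: [(String, String)]
incompatible = [("Extensions.Records.TREX", "Extensions.Records.Structs")]

-- The union of all requested feature sets, or an error naming a clash.
combine :: ModuleTable -> [String] -> Either String [String]
combine t names =
  case [ (a, b) | (a, b) <- incompatible
                , a `elem` features, b `elem` features ] of
    []         -> Right features
    (a, b) : _ -> Left ("Incompatible extensions: " ++ a ++ " and " ++ b)
  where
    features = nub (concatMap (expand t) names)
```

With this table, combine table ["HackersDelight", "AnotherSetofFeatures"]
yields the four HackersDelight features plus implicit parameters, with
the shared RankN entry appearing only once.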

Who knows, one day some of these "feature" modules might be written
in Template Haskell, or using some as-yet-unknown mechanism for
syntactic extension (e.g., like the macro system in bigwig).
If/when that happens, it will all fit quite neatly into the scheme
I've described here.

In the meantime, I think the approach that I've described here is
quite elegant, readily extensible, certainly flexible, and potentially
quite portable.  I hope you'll all like it!  :-)

All the best,
Mark