Syntax extensions (was: RE: The Future of Haskell discussion at the Haskell Workshop)

Thu Sep 11 12:46:46 EDT 2003

Mark Jones writes:=20
> As a solution to that problem, the many-command-line-options
> scheme described seems quite poor!  It's far too tool specific,
> not particularly scalable, and somewhat troublesome from a software
> engineering perspective.  We're not talking about a choice between
> two points any more; there's a whole lattice of options, which, by
> the proposal above might be controlled through a slew of tool-specific
> and either cryptic or verbose command line switches.  Will you
> remember which switches you need to give to compile your code for
> the first time in two months?  How easy will it be to translate
> those settings if you want to run your code through a different
> compiler?  How much help will the compiler give you in tracking
> down a problem if you forget to include all the necessary switches?
> And how will you figure out what options you need to use when you
> try to combine code from library X with code from library Y, each
> of which uses its own interesting slice through the feature set?
>=20
> I know that some of these problems can be addressed, at least in
> part, by careful use of Makefiles, {-# custom pragmas #-}, and perhaps
> by committing to a single tool solution.  But I'd like to propose
> a new approach that eliminates some of the command line complexities
> by integrating the selection of language extensions more tightly
> with the rest of the language.

Initially I liked the idea, but now I'm not so sure (more about that
later). But first I'll point out that the situation isn't nearly as bad
as you make out.  In GHC, the approved way to add these flags is by
using a pragma to the source code, for example:

  {-# OPTIONS -fth -fffi #-}
  module Foo where
  ...

this in itself addresses most of your complaints.  Using a
compiler-independent syntax would address another one.  We're left with:

> How much help will the compiler give you in tracking
> down a problem if you forget to include all the necessary
> switches?

We don't make any attempt to address this, but there are various ways we
could be more helpful: eg. finding a stray 'forall' in a type when
rank-N is not turned on is a clear indication.  Nevertheless, this seems
orthogonal to the issue of how to specify the language flavour in the
first place.

> And how will you figure out what options you need to use when you
> try to combine code from library X with code from library Y, each
> of which uses its own interesting slice through the feature set?

Interesting point.  Our take on this is that the extension-flags specify
the language variant in which the source code *in this module* is
written.  For example, if I define a multi-parameter type class C in
module A, then it is completely legal to import A into any other module
regardless of the settings of the flags, but I will only be able to
write an instance of C if multi-parameter type classes are enabled.

This is a consistent position which has the benefit of being easy to
understand.

> The main idea is to use the module system to capture information
> about which language features are needed in a particular program.
> For example, if you have a module that needs implicit parameters
> Template Haskell, and TREX, then you'll indicate this by including
> something like the following imports at the top of your code:
>=20
>   import Extensions.Types.ImplicitParams
>   import Extensions.Language.TemplateHaskell
>   import Extensions.Records.TREX

How do I enable hierarchical modules using this scheme? ;-)

Can any of these extensions affect the syntax of the export list?  If
so, how do I parse the module?  (perhaps I have to use a most-general
syntax for the export list, parse up to and including the imports, then
re-parse the export list).

[snip]
> Of course, code that does:
>=20
>   import Extensions.Types.Multiparam
>=20
> is not standard Haskell 98 because there's no such library in the
> standard.  This is a good thing; our code is clearly annotated as
> relying on a particular extension, without relying on the command
> line syntax for a particular tool.  Moreover, if the implementers
> of different tools can agree on the names they use, then code that
> imports Extensions.Types.Multiparam will work on any compiler that
> supports multiple parameter classes, even if the underlying
> mechanisms for enabling/disabling those features are different.
> When somebody tries to compile that same piece of code using a
> tool that doesn't support the feature, they'll get an error
> message about a missing import with a (hopefully) suggestive
> name/description, instead of a cryptic "Syntax error in constraint"
> or similar.

This complaint is also addressed by using a compiler-independent pragma,
except the error message would be "unsupported extension".

> Also, when you come back to compile your code after some
> time away, you won't need to remember which command line options you
> need because it's all there, built in to the source in a readable and
> perhaps even portable notation. You just invoke the compiler (without
> worrying about specifying options) and it does the rest!

Also applies to pragmas.

> Hmm, ok, but perhaps you're worrying now about having to enumerate
> a verbose list of language features at the top of each module you
> write.  Isn't that going to detract from readability?  This is where
> the module system wins big!  Just define a new module that imports all
> the features you need, and then allows you to access them by a single
> name.  For example, you could capture the second feature set above
> in the following:
>=20
>   module HackersDelight where
>   import Extensions.Language.Mdo
>   import Extensions.Records.Structs
>   import Extensions.Types.RankN
>   import Extensions.Types.Multiparam
>=20
> Now the only thing you have to write at the top of a module that
> needs some or all of these features is:
>=20
>   import HackersDelight

This seems like a compelling feature, and indeed when I started to write
this message I was going to say what a great idea it is.  But I've
thought about it a bit more, and concluded that I don't think it is
worth the implementation cost.  Bah!  you say, that's for the
implementors to worry about - and to a certain extent you'd be right,
but let me point out the difficulties that arise if you're allowed to do
this.

The pragma approach to specifying the language variant has a nice
property: I can parse any module into abstract syntax without knowing
anything about the semantics of imports.  Well, that's not entirely
true: to correctly implement Haskell 98 I have to know the fixities of
operators to parse a module, and fixities can be imported.  But none of
the existing Haskell 98 implementations actually does this.  Why?
Because it's too much of a pain.

If I have to understand the imports in order to parse a module, I can't
write a standalone Haskell parser.  A standalone parser would have to
chase imports, which brings in a large amount of extra complexity.

However, there are some extensions for which this approach works nicely:
one is overlapping instances.  If I have a library which exports an
interface which relies heavily on overlapping instances to work, then it
makes sense for all clients of the library to automatically get
overlapping-instances turned on.  Or does it?  Do I really want
overlapping instances turned on just because I imported module
Foo.Bar.Baz, or would I rather have to specify them explicitly in each
module that requires them?

I think I lean towards having to specify extensions explicitly in each
module, but there are clearly arguments both ways.

[ rest of message snipped ]

In conclusion, I'm not convinced.  Using pragmas in the source code
solves most of the problems, although admittedly we should agree on a
compiler-independent syntax for them.   Re-exporting extensions brings a
heavy implementation burden, and it is not clear (to me) that we really
want the ability to do this.

Cheers,
	Simon