Specifying language extensions

Thu Feb 2 19:11:31 EST 2006

One issue that pains me with Haskell 98 is that it does nothing about
one of its original stated goals as a programming language.

I've always been very fond of pointing out that Haskell has been
designed as a language for EXPERIMENTATION with language features, and
therefore the mind-boggling number of extensions in GHC is a good
thing, not a bad one.

This will NOT change with Haskell' - we already know that at least one
language feature which is in high demand (functional dependencies) is
far from ready for standardization.

However, the language currently does nothing to help keep track of the
language extensions in use by a particular piece of code.

To make things worse, many extensions require lexical and/or syntactic
changes to the language. Some extensions in the past proved to be
incompatible, and not all compilers implement all extensions.

So, looking at a particular piece of code, how do we know which
compiler it can be compiled with, and which command-line flags to pass
to which compiler? The answer is, simply, we don't.

Currently, we work around this with ugly GHC-specific pragmas, comments
in the documentation, and a plethora of similar half-measures.

I would like to propose a language feature to address this concern once
and for all. I know that a similar proposal has been circulated on the
haskell mailing list in the past, but to the best of my knowledge, none
of the current tickets addresses this problem.

The proposal:

Add explicit syntax for documenting language extensions required by a  
module.

The Problem:

  * The current language does not provide a uniform extension mechanism.

  * Pragmas (as currently used in GHC) are NOT suitable for this.
    Specifically, by their design they are COMMENTS, and therefore
    should have no impact on the language semantics. Unrecognized
    pragmas should be ignorable without warnings, while, currently, if
    you omit the GHC options pragma for a module that requires some
    syntactic extension, you will receive thousands of lexical and
    syntactic errors completely unrelated to your extension.

  * I /strongly/ believe that compiler pragmas should ONLY be used for
    extensions (such as INLINE, various SPECIALIZE, etc.) that have no
    impact on the program semantics.

The Solution:

  Add an "extension" clause to the language.

  To avoid introducing a new keyword, I propose the following syntax:

	module -> [extensions;] "module" modid [exports] "where" body
	       |  body

	extensions -> extension_1 ; ... ; extension_n

	extension -> "import" "extension" varid [extparam_1 ... extparam_n]

	extparam -> varid | conid | literal

  A module using some GHC woo-hah extension would look like:

	import extension GHC "woo-hah!"
	module Foo ...

  Or, if an extension is common enough:

	import extension fundeps
	module Foo ...

  This is a very conservative syntax. It does not support grouping,
  aliasing and renaming of extensions (as previously circulated on the
  haskell mailing list), which I personally believe would be a very bad
  idea. I want to be able to look at a piece of Haskell code and tell
  immediately which extensions it uses, without being forced to browse
  through the included modules, etc.
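
  As a rough illustration only, the grammar and the two example headers
  above could be modelled in Haskell roughly as follows. All type and
  function names here (ExtParam, Extension, classifyParam,
  renderExtension) are invented for this sketch and are not part of the
  proposal or of any compiler. The sketch accepts any identifier as the
  extension name, since the examples use both "GHC" and "fundeps" in
  that position, and it classifies parameters by their first character
  only.

```haskell
module Main where

import Data.Char (isLower, isUpper)

-- Hypothetical AST for the proposed grammar; none of these names come
-- from the proposal or from any existing compiler.
data ExtParam
  = ParamVar String   -- varid
  | ParamCon String   -- conid
  | ParamLit String   -- literal, kept as raw source text
  deriving (Eq, Show)

data Extension = Extension
  { extName   :: String
  , extParams :: [ExtParam]
  } deriving (Eq, Show)

-- Classify one parameter token by its first character, following the
-- usual Haskell convention for varid and conid. Anything else is
-- treated as a literal (a simplification).
classifyParam :: String -> ExtParam
classifyParam tok@(c : _)
  | isLower c || c == '_' = ParamVar tok
  | isUpper c             = ParamCon tok
classifyParam tok         = ParamLit tok

-- Render a clause back to the proposed concrete syntax.
renderExtension :: Extension -> String
renderExtension (Extension name params) =
  unwords ("import" : "extension" : name : map renderParam params)
  where
    renderParam (ParamVar s) = s
    renderParam (ParamCon s) = s
    renderParam (ParamLit s) = s

-- The two example headers from the text, as values.
wooHah, fundeps :: Extension
wooHah  = Extension "GHC" [classifyParam "\"woo-hah!\""]
fundeps = Extension "fundeps" []

main :: IO ()
main = mapM_ (putStrLn . renderExtension) [wooHah, fundeps]
```

  Running this prints the two "import extension" lines shown earlier.
  Keeping the representation this flat mirrors the point of the
  proposal: there is nothing to resolve, alias or expand, so the list of
  clauses is the complete answer to "which extensions does this module
  use?".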

  An extension would NOT be exported by a module that uses it, but
  would have to be specified separately by each module that uses the
  features provided by that extension. For example, one often hides
  uses of unboxed types, functional dependencies, etc., behind a
  curtain of abstract data types, and such a data type, implemented
  using non-standard features, can be happily used within
  standard-conforming Haskell programs. On the other hand, if an
  extension is visible in an interface exported by a module, it has to
  be named explicitly (with "import extension" clauses) by any module
  importing that interface.
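
  One possible reading of this rule, sketched as a check a compiler
  might run on each importing module. This is an assumption about how
  the rule could be enforced, not something the proposal specifies: the
  Interface record, the missingExtensions function and the module names
  Data.FastMap and Control.Collection are all made up for the example.

```haskell
module Main where

import Data.List (nub, (\\))

type ExtName = String

-- Hypothetical summary of an imported interface: the extensions that
-- are visible in what the module exports. Extensions used only inside
-- the implementation are deliberately not recorded here.
data Interface = Interface
  { ifaceModule  :: String
  , ifaceVisible :: [ExtName]
  }

-- Extensions that an importing module must still name, given the
-- extensions it declares itself and the interfaces it imports.
missingExtensions :: [ExtName] -> [Interface] -> [ExtName]
missingExtensions declared imports =
  nub (concatMap ifaceVisible imports) \\ declared

main :: IO ()
main = do
  let hidden  = Interface "Data.FastMap" []                 -- extension hidden behind an abstract type
      exposed = Interface "Control.Collection" ["fundeps"]  -- extension visible in the interface
  print (missingExtensions [] [hidden, exposed])            -- ["fundeps"]
  print (missingExtensions ["fundeps"] [hidden, exposed])   -- []
```

  Importing only the first interface requires no clause at all, which
  is the "curtain of abstract data types" case described above.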

  Extensions could be parameterized, since I can readily imagine
  extensions that would require such a thing. I would also recommend in
  the standard that every compiler group its own extensions under a
  common name (for example, GHC, HUGS, JHC, NHC, etc.) until they are
  in sufficiently common use to be standardized independently (such as
  fundeps), at which stage there should probably be a corresponding
  addendum to the standard for that extension.

  Specifying extensions before the "module" keyword ensures that the
  lexer and parser can find them before parsing the actual module. I
  recommend that bare modules without the "module" keyword not be
  allowed to specify any extensions, and therefore must be written in
  pure Haskell'.

  The standard itself should not define any extensions, and should
  state that the standard grammar and semantics describe the base
  language in the absence of any "import extension" clauses.

  Each extension, including FFI, should be described in a separate
  addendum.

Pros:

  * Addresses a pending language design goal.
  * Useful for automatic documentation tools such as Haddock, which
    could even generate a hyperlink from an extension name to the
    relevant addendum when available.
  * Simple to implement.
  * Neat syntax.
  * Backwards-compatible (introduces no keyword pollution).
  * Makes all information required to compile a program available
    directly in the source code, rather than in ad-hoc places such as
    the command line, Cabal package descriptions, documentation,
    comments, pragmas and what-not.
  * Returns all comments (including pragmas) to their original role as
    semantically-neutral annotations on the source program.

Cons:

  * Some implementation hassles. The compiler must use either a
    predicated parser and lexer (I believe most do already) or else
    scan the source until it finds the "module" keyword, collecting the
    extension clauses, and then parse the actual module using an
    appropriate parser and lexer chosen according to the specified
    extensions.
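
  The second strategy could look roughly like the following pre-scan.
  This is a minimal sketch under strong simplifying assumptions: one
  clause per line, no comments before "module", and parameters that
  contain no whitespace. The names Extension and scanExtensions are
  invented here. The scan stops at the first line that is not an
  extension clause (normally the "module" line) and hands the rest of
  the source back untouched, so that a parser chosen from the collected
  clauses can take over.

```haskell
module Main where

-- Hypothetical record for one "import extension" clause; parameters
-- are kept as raw tokens.
data Extension = Extension
  { extName   :: String
  , extParams :: [String]
  } deriving (Eq, Show)

-- Collect the extension clauses at the top of the source and return
-- them together with the untouched remainder. Blank lines between
-- clauses are skipped.
scanExtensions :: String -> ([Extension], String)
scanExtensions = go [] . lines
  where
    go acc (l : ls) = case words l of
      []                                -> go acc ls
      ("import" : "extension" : n : ps) -> go (Extension n ps : acc) ls
      _                                 -> (reverse acc, unlines (l : ls))
    go acc [] = (reverse acc, "")

main :: IO ()
main = do
  let src = unlines
        [ "import extension GHC \"woo-hah!\""
        , "import extension fundeps"
        , "module Foo where"
        , "x = 1"
        ]
      (exts, body) = scanExtensions src
  mapM_ print exts   -- the two collected clauses
  putStr body        -- the module text, starting at "module Foo where"
```

  A real implementation would have to tokenize properly (string
  literals with spaces, comments, semicolon-separated clauses), but the
  control flow stays the same: collect clauses, pick a lexer and
  parser, then parse the remainder.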

What do people think? I would like to throw the idea around on the
mailing list before entering it into the ticket system.

Cheers,

	Pat.