Specifying language extensions
Patryk Zadarnowski
patrykz at cse.unsw.edu.au
Thu Feb 2 19:11:31 EST 2006
One issue that pains me with Haskell 98 is that it does nothing about one
of its original stated goals as a programming language.
I've always been very fond of pointing out that Haskell was designed as a
language for EXPERIMENTATION with language features, and therefore the
mind-boggling number of extensions in GHC is a good thing, not a bad one.
This will NOT change with Haskell' - we already know that at least one
language feature which is in high demand (functional dependencies) is far
from ready for standardization.
However, the language currently does nothing to help keep track of the
language extensions used by a particular piece of code.
To make things worse, many extensions require lexical and/or syntactic
changes to the language. Some extensions in the past proved to be
incompatible, and not all compilers implement all extensions.
So, looking at a particular piece of code, how do we know which compiler
it can be compiled with, and which command-line flags do we pass to which
compiler? The answer is, simply, we don't.
Currently, we do this by introducing ugly GHC-specific pragmas, comments
in the documentation, and a plethora of similar half-measures.
I would like to propose a language feature to address this concern once
and for all. I know that a similar proposal has been circulated on the
haskell mailing list in the past, but to the best of my knowledge, none of
the current tickets addresses this problem.
The proposal:
Add explicit syntax for documenting language extensions required by a module.
The Problem:
* The current language does not provide a uniform extension mechanism.
* Pragmas (as currently used in GHC) are NOT suitable for this.
  Specifically, by their design they are COMMENTS, and therefore should
  have no impact on the language semantics. Unrecognized pragmas should be
  ignorable without warnings, while, currently, if you omit the GHC
  options pragma for a module that requires some syntactic extension, you
  will receive thousands of lexical and syntactic errors completely
  unrelated to your extension.
* I /strongly/ believe that compiler pragmas should ONLY be used for
  extensions (such as INLINE, various SPECIALIZE, etc.) that have no
  impact on the program semantics.
The Solution:
Add an "extension" clause to the language.
To avoid introducing a new keyword, I propose the following syntax:
    module     -> [extensions;] "module" modid [exports] "where" body
                | body
    extensions -> extension_1 ; ... ; extension_n
    extension  -> "import" "extension" varid [extparam_1 ... extparam_n]
    extparam   -> varid | conid | literal
A module using some GHC woo-hah extension would look like:
    import extension GHC "woo-hah!"
    module Foo ...
Or, if an extension is common enough:
    import extension fundeps
    module Foo ...
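To make the clause syntax concrete, here is a minimal sketch of how a
single "import extension" line could be read into a small data structure.
All the names in it (ExtParam, Extension, tokens, classify,
parseExtension) are made up for illustration and are not part of the
proposal. It assumes one clause per line, as in the examples, and it
accepts a capitalized extension name such as GHC as well as a varid, which
is an assumption on top of the grammar as written.

```haskell
import Data.Char (isAlphaNum, isDigit, isLower, isSpace, isUpper)

-- Hypothetical AST for one "import extension" clause; these type and
-- constructor names are illustrative and not part of the proposal.
data ExtParam
  = PVar String   -- varid
  | PCon String   -- conid
  | PLit String   -- literal, kept as its source text
  deriving (Eq, Show)

data Extension = Extension
  { extName   :: String
  , extParams :: [ExtParam]
  } deriving (Eq, Show)

-- Split a clause into whitespace-separated tokens, keeping a string
-- literal (which may contain spaces) as a single token.
tokens :: String -> Maybe [String]
tokens s = case dropWhile isSpace s of
  "" -> Just []
  rest@('"' : _) -> case reads rest :: [(String, String)] of
    ((str, rest') : _) -> (show str :) <$> tokens rest'
    []                 -> Nothing   -- malformed string literal
  rest ->
    let (tok, rest') = break isSpace rest
    in (tok :) <$> tokens rest'

isIdentChar :: Char -> Bool
isIdentChar c = isAlphaNum c || c == '_' || c == '\''

-- Classify one token as varid, conid or literal. Only string and
-- unsigned integer literals are recognised in this sketch.
classify :: String -> Maybe ExtParam
classify tok@(c : cs)
  | c == '"'                                 = Just (PLit tok)
  | all isDigit tok                          = Just (PLit tok)
  | isLower c || c == '_', all isIdentChar cs = Just (PVar tok)
  | isUpper c, all isIdentChar cs            = Just (PCon tok)
classify _ = Nothing

-- Assumption: the extension name may be a varid or a conid.
isName :: String -> Bool
isName tok = case classify tok of
  Just (PVar _) -> True
  Just (PCon _) -> True
  _             -> False

-- Parse one clause of the form: import extension NAME [PARAM ...]
parseExtension :: String -> Maybe Extension
parseExtension line = do
  toks <- tokens line
  case toks of
    "import" : "extension" : name : params
      | isName name -> Extension name <$> mapM classify params
    _ -> Nothing
```

An ordinary import such as "import Data.List" is rejected by
parseExtension, so a pre-pass built on it would not confuse regular
imports with extension clauses.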
This is a very conservative syntax. It does not support grouping, aliasing
and renaming of extensions (as previously circulated on the haskell
mailing list), which I personally believe would be a very bad idea. I want
to be able to look at a piece of Haskell code and tell immediately which
extensions it uses, without being forced to browse through the included
modules, etc.
Extensions would NOT be exported by a module that uses that extension, but
would have to be specified separately by each module that uses the
features provided by that extension. For example, one often hides uses of
unboxed types, functional dependencies, etc., behind a curtain of abstract
data types, and such data types, implemented using non-standard features,
can be happily used within standard-conforming Haskell programs. On the
other hand, if an extension is visible in an interface exported by a
module, it has to be named explicitly (with "import extension" clauses) by
any module importing that interface.
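The visibility rule above can be sketched as a small check over a
hypothetical module summary. The ModuleInfo record and the missingExts
function are invented for this illustration; in particular, how a compiler
would decide which extensions are "visible in the interface" is assumed
here to be already known and stored in interfaceExts.

```haskell
import Data.List ((\\))

-- Hypothetical per-module summary; all field names are illustrative.
data ModuleInfo = ModuleInfo
  { modName       :: String
  , declaredExts  :: [String]  -- named by "import extension" clauses
  , interfaceExts :: [String]  -- extensions visible in the exported interface
  , modImports    :: [String]  -- names of imported modules
  } deriving (Eq, Show)

-- For one module, list each imported module together with the extensions
-- that its interface exposes but the importer has not named itself.
-- An empty result means the importer satisfies the rule.
missingExts :: [ModuleInfo] -> ModuleInfo -> [(String, [String])]
missingExts world m =
  [ (modName dep, missing)
  | depName <- modImports m
  , dep <- filter ((== depName) . modName) world
  , let missing = interfaceExts dep \\ declaredExts m
  , not (null missing)
  ]
```

A module that uses unboxed types only internally would have an empty
interfaceExts, so plain Haskell clients pass the check. A module that
exports a class with functional dependencies would list fundeps there, and
every importer would then have to declare fundeps too.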
Extensions could be parameterized, since I can readily imagine extensions
that would require such a thing. I would also recommend in the standard
that every compiler group its own extensions under a common name (for
example, GHC, HUGS, JHC, NHC, etc.) until they are in sufficiently common
use to be standardized independently (such as fundeps), at which stage
there should probably be a corresponding addendum to the standard for that
extension.
Specifying extensions before the "module" keyword ensures that the lexer
and parser can find them before parsing the actual module. I recommend
that bare modules without the "module" keyword not be allowed to specify
any extensions, and therefore must be written in pure Haskell'.
The standard itself should not define any extensions, and should state
that the standard grammar and semantics describe the base language in the
absence of any "import extension" clauses.
Each extension, including FFI, should be described in a separate addendum.
Pros:
* Addresses a pending language design goal.
* Useful for automatic documentation tools such as Haddock, which could
  even generate a hyperlink from an extension name to the relevant
  addendum when available.
* Simple to implement.
* Neat syntax.
* Backwards-compatible (introduces no keyword pollution).
* Makes all information required to compile a program available directly
  in the source code, rather than in ad-hoc places such as the command
  line, Cabal package descriptions, documentation, comments, pragmas and
  what-not.
* Returns all comments (including pragmas) to their original role as
  semantically-neutral annotations on the source program.
Cons:
* Some implementation hassles. The compiler must use either a predicated
  parser and lexer (I believe most do already) or else parse the module
  until it finds the "module" keyword, collecting the extension clauses,
  and then parse the actual module using an appropriate parser and lexer
  chosen according to the specified extensions.
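The second strategy (scan up to the "module" keyword, then pick a lexer)
could look roughly like the sketch below. It is only an illustration under
simplifying assumptions: one clause per line, no comments before the
module header, and made-up names (Prescan, splitHeader, clauseName,
prescan). It also enforces the recommendation that a bare module without
the "module" keyword cannot carry extension clauses.

```haskell
import Data.Maybe (mapMaybe)

-- Hypothetical result of the pre-scan: the extension names found before
-- the module header, plus the remaining source for the real parser.
data Prescan = Prescan
  { scannedExts :: [String]
  , scannedBody :: String
  } deriving (Eq, Show)

-- Peel off leading "import extension" lines. Blank lines in the header
-- are skipped, so line numbers in the body shift; a real implementation
-- would need to track source positions.
splitHeader :: String -> ([String], String)
splitHeader = go [] . lines
  where
    go acc (l : ls)
      | null (words l)     = go acc ls
      | isClause (words l) = go (l : acc) ls
    go acc rest = (reverse acc, unlines rest)

    isClause ("import" : "extension" : _ : _) = True
    isClause _                                = False

-- Name of the extension in a clause line: the word after "import extension".
clauseName :: String -> Maybe String
clauseName l = case words l of
  "import" : "extension" : name : _ -> Just name
  _                                 -> Nothing

-- Collect the extension names, rejecting extension clauses that are not
-- followed by an explicit module header.
prescan :: String -> Either String Prescan
prescan src
  | null clauses || startsWithModule = Right (Prescan names body)
  | otherwise = Left "extension clauses require an explicit module header"
  where
    (clauses, body)  = splitHeader src
    names            = mapMaybe clauseName clauses
    startsWithModule = take 1 (words body) == ["module"]
```

The list in scannedExts is what a compiler would use to choose or
configure the lexer and parser for scannedBody. Extension parameters are
ignored here to keep the sketch short.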
What do people think? I would like to throw the idea around on the mailing
list before entering it into the ticket system.
Cheers,
Pat.