Isaac Dupree isaacdupree at charter.net
Sat Mar 3 12:18:44 EST 2007

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Isaac Dupree wrote:
> we don't have standards-quality wording yet

Okay, here's a first attempt at formalizing it. It's still really messy,
and it doesn't yet incorporate the narrative material from the Haskell 98
literate comments section.  Any feedback so far?

Literate programs are interpreted as a series of lines that are parsed,
and each line is transformed to another line that is part of the program
text.

Here are some predicates (String -> Bool) for classifying lines during
parsing.

BIRDPROG = begins with ">"
BEGIN = "\begin{code}" ++ (all isSpace)
END = "\end{code}" ++ (all isSpace)
TEXPROG = doesn't begin with "\begin{code}" or "\end{code}"
BLANKLINE = all isSpace
COMMENTLINE = doesn't begin with ">", "\begin{code}" or "\end{code}" and
not all isSpace
BORINGLINE = BLANKLINE | COMMENTLINE = doesn't begin with ">",
"\begin{code}" or "\end{code}"

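The predicates above could be written out roughly as follows. This is
only a sketch: the function names (isBirdProg, isDelimiter and so on) are
my own and do not come from the Report.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf, stripPrefix)

-- BIRDPROG: the line begins with ">".
isBirdProg :: String -> Bool
isBirdProg = (">" `isPrefixOf`)

-- Helper (hypothetical name): the line is the given delimiter followed
-- only by whitespace.
isDelimiter :: String -> String -> Bool
isDelimiter delim line = maybe False (all isSpace) (stripPrefix delim line)

-- BEGIN and END.
isBegin, isEnd :: String -> Bool
isBegin = isDelimiter "\\begin{code}"
isEnd   = isDelimiter "\\end{code}"

-- TEXPROG: the line begins with neither delimiter.
isTexProg :: String -> Bool
isTexProg line =
  not (any (`isPrefixOf` line) ["\\begin{code}", "\\end{code}"])

-- BLANKLINE: whitespace only (including the empty line).
isBlankLine :: String -> Bool
isBlankLine = all isSpace

-- BORINGLINE: begins with neither ">" nor a delimiter.
isBoringLine :: String -> Bool
isBoringLine line = isTexProg line && not (isBirdProg line)

-- COMMENTLINE: a boring line that is not blank.
isCommentLine :: String -> Bool
isCommentLine line = isBoringLine line && not (isBlankLine line)
```

Note that a line such as "\begin{code} foo" satisfies none of BEGIN,
TEXPROG or BORINGLINE, so neither grammar below accepts it anywhere.
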
If we want to incorporate the "no comment lines adjacent to program
lines" rule into the syntax here, we have
comment = one or more COMMENTLINE
birdprog = one or more BIRDPROG
birdOrComment = birdprog | comment
texprog = BEGIN (zero or more TEXPROG) END
notBirdOrComment = BLANKLINE | texprog
file = (zero or more notBirdOrComment) birdOrComment (zero or more ((one
or more notBirdOrComment) birdOrComment)) (zero or more notBirdOrComment)

which is a little ugly compared to
file = zero or more (BORINGLINE | BIRDPROG | texprog)
texprog = BEGIN (zero or more TEXPROG) END
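
As a rough sketch of how the two grammars relate, the code below
recognizes the simpler grammar by classifying each line, and then checks
the adjacency rule as a separate pass. The names (Kind, classify,
matchesSimple, noCommentNextToBird) are hypothetical, and the second
check covers only the bird/comment adjacency, not the stricter grammar's
demand for at least one birdOrComment.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf, stripPrefix)

-- The line is the given delimiter followed only by whitespace.
isDelimiter :: String -> String -> Bool
isDelimiter delim line = maybe False (all isSpace) (stripPrefix delim line)

-- The line begins with "\begin{code}" or "\end{code}".
startsWithDelimiter :: String -> Bool
startsWithDelimiter line =
  any (`isPrefixOf` line) ["\\begin{code}", "\\end{code}"]

data Kind = Bird | Comment | Blank | Tex deriving (Eq, Show)

-- Classify every line per "file = zero or more (BORINGLINE | BIRDPROG |
-- texprog)".  Returns Nothing if the file does not match that grammar.
classify :: [String] -> Maybe [Kind]
classify [] = Just []
classify (l:ls)
  | isDelimiter "\\begin{code}" l =
      -- texprog = BEGIN (zero or more TEXPROG) END
      case break startsWithDelimiter ls of
        (body, e:rest)
          | isDelimiter "\\end{code}" e ->
              fmap (replicate (length body + 2) Tex ++) (classify rest)
        _ -> Nothing
  | startsWithDelimiter l = Nothing   -- stray or malformed delimiter
  | ">" `isPrefixOf` l    = fmap (Bird :) (classify ls)
  | all isSpace l         = fmap (Blank :) (classify ls)
  | otherwise             = fmap (Comment :) (classify ls)

matchesSimple :: [String] -> Bool
matchesSimple = maybe False (const True) . classify

-- The adjacency rule only: no COMMENTLINE directly next to a BIRDPROG.
noCommentNextToBird :: [String] -> Bool
noCommentNextToBird ls =
  case classify ls of
    Nothing -> False
    Just ks -> and (zipWith ok ks (drop 1 ks))
  where
    ok Bird Comment = False
    ok Comment Bird = False
    ok _ _          = True
```

Keeping the adjacency rule as a separate check over the classified lines
is one way to keep the simpler grammar while still enforcing the rule.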

(Neither grammar expresses the requirement, not explicitly present in
Haskell 98, that the result must contain at least one actual program
line.  Of course, without such a requirement, a file with no program
lines would mean "module Main(main) where {}", which is an error anyway
because it exports a main that it never defines.)

Lines judged to be BIRDPROG have the initial ">" replaced with a " ".
Lines judged to be TEXPROG are retained intact.  All other lines are
reduced to emptiness.
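
A minimal sketch of this transformation, assuming the input has already
been checked against one of the grammars above. The name unlit and the
helper isDelimiter are my own choices here; if a \begin{code} block is
never closed, this sketch simply keeps the remaining lines, which is an
arbitrary decision rather than anything specified above.

```haskell
import Data.Char (isSpace)
import Data.List (stripPrefix)

-- The line is the given delimiter followed only by whitespace.
isDelimiter :: String -> String -> Bool
isDelimiter delim line = maybe False (all isSpace) (stripPrefix delim line)

-- Transform each input line into exactly one output line, so that line
-- numbers in the program text match those in the literate source.
unlit :: [String] -> [String]
unlit = outside
  where
    -- Outside a \begin{code} ... \end{code} block.
    outside [] = []
    outside (l:ls)
      | isDelimiter "\\begin{code}" l = "" : inside ls
      | otherwise                     = birdOrEmpty l : outside ls

    -- Inside a block: TEXPROG lines are retained intact.
    inside [] = []
    inside (l:ls)
      | isDelimiter "\\end{code}" l = "" : outside ls
      | otherwise                   = l : inside ls

    -- BIRDPROG: replace the initial ">" with " ".  Everything else
    -- (blank and comment lines) is reduced to emptiness.
    birdOrEmpty ('>':rest) = ' ' : rest
    birdOrEmpty _          = ""
```

For a whole file this could be applied as "unlines . unlit . lines".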

It is not advisable to have a single lexical unit span a line that was a
COMMENTLINE.  Here "lexical unit" means either a "lexeme" (for these
purposes, only one containing a "gap") or an "ncomment"; both terms are
from Haskell 98 section 9.2, Lexical Syntax.  (Or: it is inadvisable for
one to span ANY line other than a BIRDPROG or TEXPROG line, because doing
so is just weird anyway.)

Isaac
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFF6a3zHgcxvIWYTTURAiqtAKCAXtNFueCxsTRNJpuSYuPL+6On4wCeJNQV
Teswf08Pr0senEaeNRNvJsw=
=ZiJL
-----END PGP SIGNATURE-----