A HERE Document syntax
Jason Dusek
jason.dusek at gmail.com
Wed Apr 22 23:52:59 EDT 2009
The conventional HERE document operator -- `<<` -- is not a
good fit for Haskell, where it is already a perfectly legal
user-level operator. I'd like to propose the use of backticks
for HERE documents instead.
Examples
--------------------------------------------------------------
A doubled backtick would introduce an "indented HERE
document", one bounded by the layout rule in the same way as
a `do` block. Just as with `do`, the doubled backticks must
be followed by whitespace. For example:
    rawXML = `` <a>
                 <b> text </b>
                </a>
This is just the same as the Haskell string:
    "<a>\n <b> text </b>\n</a>"
Note that the final newline -- and any final whitespace -- is
stripped.
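To make the stripping rule concrete, here is a minimal sketch of the string an indented HERE document would denote. The function `indentedHere` is hypothetical and not part of the proposal. It assumes the body is handed over as it sits in the source file, with the first line padded out to its real column, that indentation uses spaces only, and that the layout rule has already cut the body off where the block ends.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, intercalate)

-- | Hypothetical helper: the string denoted by an indented HERE document.
-- Assumes the body has already been delimited by the layout rule and that
-- its first line is padded with spaces out to its source column.
indentedHere :: String -> String
indentedHere body =
  dropWhileEnd isSpace (intercalate "\n" (map dropMargin ls))
  where
    ls = lines body

    -- The column of the first line sets the margin, as the first token
    -- of a `do` block sets the block's indentation.
    margin = case ls of
      []      -> 0
      (l : _) -> length (takeWhile (== ' ') l)

    -- Remove at most `margin` leading spaces; never eat real content.
    dropMargin l =
      let (pad, rest) = span (== ' ') l
      in  drop margin pad ++ rest
```

Applied to the `rawXML` body above (with `<a>` and `</a>` in the same column and `<b>` one column further in), this sketch gives back the three lines joined by newlines, with the single extra space before `<b>` preserved and the final newline gone.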
Non-indented HERE documents are also of value; these require a
terminator as well as an opener. Three or more backticks
introduce a non-indented HERE document, and the same number of
backticks terminates it. All leading whitespace up to and
including the first newline is consumed; all trailing
whitespace is likewise consumed. For example:
    usage = ````
    USAGE: hello_world <name>?

    Says hello, optionally to <name>.
    ````
This yields the Haskell string:
    "USAGE: hello_world <name>?\n\nSays hello, optionally to <name>."
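The two rules for non-indented documents can be sketched as follows. The names `Opener`, `classifyOpener` and `fencedBody` are hypothetical, chosen only for illustration. `classifyOpener` looks at the run of backticks that starts a document, and `fencedBody` takes the raw text between the opening and closing fences and trims it as described. Treating only space and tab as the "leading whitespace" before the first newline is an assumption.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Hypothetical classification of the backtick run that opens a document.
data Opener = Indented | Fenced Int
  deriving (Eq, Show)

-- | Two backticks open an indented document; three or more open a
-- non-indented one that must be closed by the same number of backticks.
classifyOpener :: String -> Maybe Opener
classifyOpener s =
  case length (takeWhile (== '`') s) of
    2 -> Just Indented
    n | n > 2 -> Just (Fenced n)
    _ -> Nothing

-- | Trim the raw text found between the fences: leading blanks up to and
-- including the first newline are consumed, as is all trailing whitespace.
fencedBody :: String -> String
fencedBody raw = dropWhileEnd isSpace (dropNewline (dropWhile isBlank raw))
  where
    isBlank c = c == ' ' || c == '\t'
    dropNewline ('\n' : rest) = rest
    dropNewline rest          = rest
```

For the `usage` example, the raw text between the fences starts with a newline and ends with one; both are removed, while the blank line in the middle survives as `\n\n`.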
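There is a natural relationship between quasi-quoting and HERE
documents. Thankfully, our rule about spaces gives us a way to
integrate quasi-quoters: since the opening backticks must
otherwise be followed by whitespace, a quasi-quoter name
written directly after them, and closed by a single backtick,
is unambiguous.
    import Regexen(r)
    matchUTCDate = ``r` \d{4}-\d\d-\d\dT\d\d:\d\d:\d\dZ
    matchSource = ``r` [a-z]+\[\d+\]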
We use two backticks to introduce an indented HERE document
and then specify the regular expression quasi-quote
interpreter. This is brief and clear and not too full of
demarcating cruft; it's not the usual `/[a-z]+\[\d+\]/` but
it's still pretty nice.
String Escapes & The Backslash Plague
--------------------------------------------------------------
Naturally, one desires to use character escapes in HERE
documents; however, the backslash plague is a bother, and
we'd rather not bring that along, either.
Many Haskell escapes should be ignored. The string escapes
for double quote and newline are unnecessary; likewise, we
don't need to escape backslash. On the other hand, we
cannot do without character escapes for `\BEL`, Chinese
characters and so forth. Let us adopt the rule that the 2
and 3 letter escapes are kept, as well as the hexadecimal
escapes; the "control-with-character" escapes (such as
`\^G`), octal escapes, decimal escapes and all single
character escapes save `\&` are ignored.
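The kept/ignored split can be written down as a small classifier. This is only a sketch of which escape forms are eligible; it takes the text that follows a backslash and says nothing about how a sequence of escapes is terminated. The names `EscapeClass` and `classifyEscape` are hypothetical. The list of 2 and 3 letter names is the ASCII escape set from the Haskell report, which I take to be what "2 and 3 letter escapes" means.

```haskell
import Data.Char (isHexDigit)

-- | Hypothetical result type: is this escape form honoured in a HERE
-- document, or left in the text as ordinary characters?
data EscapeClass = Kept | Ignored
  deriving (Eq, Show)

-- | The named ASCII escapes of the Haskell report (all 2 or 3 characters).
asciiNames :: [String]
asciiNames =
  [ "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL"
  , "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI"
  , "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB"
  , "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US"
  , "SP", "DEL"
  ]

-- | Classify the text that follows a backslash, e.g. "BEL" for \BEL.
classifyEscape :: String -> EscapeClass
classifyEscape body
  | body == "&"            = Kept     -- the empty-string escape
  | body `elem` asciiNames = Kept     -- 2 and 3 letter escapes
  | isHex body             = Kept     -- hexadecimal escapes such as \x4e2d
  | otherwise              = Ignored  -- \^G, \o7, \7, \n, \", \\ and so on
  where
    isHex ('x' : ds@(_ : _)) = all isHexDigit ds
    isHex _                  = False
```

Under this sketch `classifyEscape "BEL"` and `classifyEscape "x4e2d"` are `Kept`, while `classifyEscape "n"` and `classifyEscape "^G"` are `Ignored`, so a `\n` written in a HERE document stays a backslash followed by an `n`.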
What if we want to document the escape rules of HERE
documents in a HERE document? We have to have a way to
"unescape" the escapes. The rule proposed is odd but simple:
    escaping = ``
      There comes a time when one desires to escape a magical
      character such as ASCII bell (\BEL). How shall we do it?
      We insert it with \BEL\&\\. \BEL\

      We can insert a string of \BELs as long as they are
      terminated with a \. We enter \BEL\BEL\BEL\&\\ to
      ding three times. \BEL\BEL\BEL\

      As long as an escape pattern does not end with a
      backslash, it is interpreted literally. The empty string
      escape (\&) has an unusual power -- it is a "combo
      breaker" and prevents an otherwise legitimate sequence
      of escapes from being escaped. How do we output the
      literal we used for \BEL\&\\? It's \BEL \BS\\& \BS\\\.
This example rings once when the first paragraph is printed
and thrice when the second is printed. It becomes the
Haskell string below (apart from the final newline that
`unlines` appends, which a HERE document would strip):
    unlines
      ["There comes a time when one desires to escape a magical"
      ,"character such as ASCII bell (\\BEL). How shall we do it?"
      ,"We insert it with \\BEL\\. \BEL"
      ,""
      ,"We can insert a string of \\BELs as long as they are"
      ,"terminated with a \\. We enter \\BEL\\BEL\\BEL\\ to"
      ,"ding three times. \BEL\BEL\BEL"
      ,""
      ,"As long as an escape pattern does not end with a"
      ,"backslash, it is interpreted literally. The empty string"
      ,"escape (\\&) has an unusual power -- it is a \"combo"
      ,"breaker\" and prevents an otherwise legitimate sequence"
      ,"of escapes from being escaped. How do we output the"
      ,"literal we used for \\BEL\\? It's \\BEL \BS\\& \BS\\\\."
      ]
This is the most sensible way that I can see to get rid of
the need to escape backslashes but I grant it's a bit
burdensome. I expect such advanced escaping to be rarely
used.
HERE documents are an aid to quasi-quotation and scripting.
The syntax here proposed is simple, using only backticks in an
unusual pattern, and allows for following the layout rule when
that is desired.
--
Jason Dusek