A HERE Document syntax

Jason Dusek jason.dusek at gmail.com
Wed Apr 22 23:52:59 EDT 2009


  The conventional HERE document operator -- `<<` -- is not a
  good fit for Haskell, where it is a perfectly legal user-level
  operator. I'd like to propose the use of backticks for HERE
  documents instead.


                                                        Examples
   --------------------------------------------------------------

    A doubled backtick would introduce an "indented HERE
    document", one that would be bounded by the layout rules in
    the same way as a `do` block. Just like a `do`, the doubled
    backticks must be followed by whitespace. For example:

      rawXML                 =  ``  <a>
                                      <b> text </b>
                                    </a>

    This is just the same as the Haskell string:

      "<a>\n  <b> text </b>\n</a>"

    Note that the final newline -- and any final whitespace -- is
    stripped.
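
    The stripping rule can be sketched as an ordinary function
    over the text of the block. This is only an illustration and
    not part of the proposal: the name `indentedHere` is
    hypothetical, and the sketch assumes that everything before
    the first token has already been replaced by spaces, so that
    the first line's indentation is the block's layout column.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, intercalate)

-- | Hypothetical helper (not part of the proposal): compute the string
-- denoted by an indented HERE document. Assumes the text preceding the
-- first token has already been replaced by spaces.
indentedHere :: String -> String
indentedHere raw = dropWhileEnd isSpace (intercalate "\n" (map unindent ls))
  where
    ls = lines raw
    -- The layout column is the indentation of the first non-blank line.
    column = case filter (any (not . isSpace)) ls of
      []      -> 0
      (l : _) -> length (takeWhile (== ' ') l)
    -- Remove at most `column` leading spaces from a line.
    unindent l =
      let (spaces, rest) = span (== ' ') l
      in drop (min column (length spaces)) spaces ++ rest
```

    Applied to the text of the `rawXML` example, with the first
    line padded out to its column, this should give the string
    shown above.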

    Non-indented HERE documents are also of value; these require a
    terminator as well as an initializer. Three or more backticks
    introduce a non-indented HERE document; the same number of
    backticks terminates it. All leading whitespace up to and
    including the first newline is consumed; all final whitespace
    is likewise consumed. For example:

      usage                  =  ````
USAGE: hello_world <name>?

Says hello, optionally to <name>.
                                ````

    This yields the Haskell string:

      "USAGE: hello_world <name>?\n\nSays hello, optionally to <name>."

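    The trimming applied to a non-indented HERE document can be
    sketched in the same way. The name `rawHere` is hypothetical,
    and its input is assumed to be exactly the text between the
    opening and closing backticks. The behaviour when no newline
    follows the leading whitespace is not specified here; this
    sketch leaves such text untouched at the front.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Hypothetical helper (not part of the proposal): trim the text found
-- between the delimiters of a non-indented HERE document.
rawHere :: String -> String
rawHere = dropWhileEnd isSpace . dropLeading
  where
    -- Consume leading blanks up to and including the first newline.
    -- Assumption: if no newline follows the blanks, keep the text as-is.
    dropLeading s = case dropWhile isBlank s of
      '\n' : rest -> rest
      _           -> s
    isBlank c = isSpace c && c /= '\n'
```

    Fed the text between the two ```` delimiters of the `usage`
    example, this should produce the string shown above.
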
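    There is a natural relationship between quasi-quoting and HERE
    documents. Thankfully, the rule that doubled backticks must be
    followed by whitespace gives us a way to integrate
    quasi-quoters: a name written directly after the backticks
    selects the quasi-quoter.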

      import Regexen(r)

      matchUTCDate           =  ``r` \d{4}-\d\d-\d\dT\d\d:\d\d:\d\dZ
      matchSource            =  ``r` [a-z]+\[\d+\]

    We use two backticks to introduce an indented HERE document
    and then specify the regular expression quasi-quote
    interpreter. This is brief and clear and not too full of
    demarcating cruft; it's not the usual `/[a-z]+\[\d+\]/` but
    it's still pretty nice.


                           String Escapes & The Backslash Plague
   --------------------------------------------------------------

    Naturally, one desires to use character escapes in HERE
    documents; however, the backslash plague is a bother, and
    we'd like to avoid that as well.

    Many Haskell escapes should be ignored. The string escapes
    for double quote and newline are unnecessary; likewise, we
    don't need to escape backslash. On the other hand, we cannot
    do without character escapes for `\BEL`, Chinese characters
    and so forth. Let us adopt the rule that the two- and
    three-letter escapes are kept, as well as the hexadecimal
    escapes; the "control-with-character" escapes, octal
    escapes, decimal escapes and all single character escapes
    save `\&` are ignored.
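
    As a rough illustration of which escapes survive under this
    rule, here is a small classifier. It is a sketch only: the
    names `asciiNames` and `keptEscape` are hypothetical, the
    argument is the escape body without its leading backslash,
    and the list of names is the set of two- and three-letter
    ASCII escapes from the Haskell report.

```haskell
import Data.Char (isHexDigit)

-- | The two- and three-letter ASCII escape names of Haskell strings.
asciiNames :: [String]
asciiNames =
  [ "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL", "BS", "HT"
  , "LF", "VT", "FF", "CR", "SO", "SI", "DLE", "DC1", "DC2", "DC3"
  , "DC4", "NAK", "SYN", "ETB", "CAN", "EM", "SUB", "ESC", "FS", "GS"
  , "RS", "US", "SP", "DEL" ]

-- | Hypothetical classifier: is this escape body (the text after the
-- backslash) still treated as an escape inside a HERE document?
keptEscape :: String -> Bool
keptEscape body = body `elem` asciiNames || isHex body || body == "&"
  where
    -- Hexadecimal escapes: x or X followed by at least one hex digit.
    isHex (c : ds@(_ : _)) | c `elem` "xX" = all isHexDigit ds
    isHex _                                = False
```

    Under these assumptions `keptEscape "BEL"` and
    `keptEscape "x4e2d"` hold, while `keptEscape "n"` and
    `keptEscape "^A"` do not.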

    What if we want to document the escape rules of HERE
    documents in a HERE document? We have to have a way to
    "unescape" the escapes. The rule proposed is odd but simple:

      escaping               =  ``
        There comes a time when one desires to escape a magical
        character such as ASCII bell (\BEL). How shall we do it?
        We insert it with \BEL\&\\. \BEL\

        We can insert a string of \BELs as long as they are
        terminated with a \. We enter \BEL\BEL\BEL\&\\ to
        ding three times. \BEL\BEL\BEL\

        As long as an escape pattern does not end with a
        backslash, it is interpreted literally. The empty string
        escape (\&) has an unusual power -- it is a "combo
        breaker" and prevents an otherwise legitimate sequence
        of escapes from being escaped. How do we output the
        literal we used for \BEL\&\\? It's \BEL \BS\\& \BS\\\.

    This example rings once when the first paragraph is printed
    and thrice when the second is printed. It becomes the
    Haskell string:

      intercalate "\n"
      ["There comes a time when one desires to escape a magical"
      ,"character such as ASCII bell (\\BEL). How shall we do it?"
      ,"We insert it with \\BEL\\. \BEL"
      ,""
      ,"We can insert a string of \\BELs as long as they are"
      ,"terminated with a \\. We enter \\BEL\\BEL\\BEL\\ to"
      ,"ding three times. \BEL\BEL\BEL"
      ,""
      ,"As long as an escape pattern does not end with a"
      ,"backslash, it is interpreted literally. The empty string"
      ,"escape (\\&) has an unusual power -- it is a \"combo"
      ,"breaker\" and prevents an otherwise legitimate sequence"
      ,"of escapes from being escaped. How do we output the"
      ,"literal we used for \\BEL\\? It's \\BEL \BS\\& \BS\\\\."
      ]

    This is the most sensible way that I can see to get rid of
    the need to escape backslashes, but I grant that it's a bit
    burdensome. I expect such advanced escaping to be rarely
    used.


  HERE documents are an aid to quasi-quotation and scripting.
  The syntax here proposed is simple, using only backticks in an
  unusual pattern, and allows for following the layout rule when
  that is desired.

--
Jason Dusek


More information about the Haskell-prime mailing list