What separates lines in Haskell code?
isaacdupree at charter.net
Sun Jun 17 13:36:19 EDT 2007
Antti-Juhani Kaijanaho wrote:
> On Thu, Jun 14, 2007 at 09:11:12AM -0400, Isaac Dupree wrote:
>> In the report, under the layout rule (section 9.3), "The characters
>> newline, return, linefeed, and formfeed, all start a new line." (Which
>> four characters are those? from http://en.wikipedia.org/wiki/Linefeed ,
>> I'm guessing "LF: Line Feed, U+000A", "CR: Carriage Return, U+000D",
>> "FF: Form Feed, U+000C", and what's the fourth one? Newline usually
>> refers to '\n', which is LF, but linefeed has a direct name
>> correspondence to that also!)
> The H98 lexical syntax defines newline as
> newline -> return linefeed | return | linefeed | formfeed
> It could, I suppose, also refer to the Unicode character U+2028 LINE SEPARATOR,
> but then probably U+2029 PARAGRAPH SEPARATOR ought to be included as well.
> There are, BTW, Unicode guidelines for newline usage in section 5.8 of the
> Unicode 5.0 online edition.
Alright, I think the comment in the layout-rule section should not try
to enumerate newlines, but rather should refer back to the lexical
definition of 'newline'.
Based on the Unicode guideline above, the set of characters that
Haskell 98 already accepts as newlines, and a section of the Unicode regex
guidelines <http://unicode.org/reports/tr18/>, I propose that all of the
following be accepted as line separators:
\u000A | \u000B | \u000C | \u000D | \u0085 | \u2028 | \u2029 | \u000D\u000A
i.e. (not in the same order) CR, LF, CRLF, NEL, VT, FF, LS, PS.
Unfortunately, that makes the input a little harder to process; perhaps
all of them could be translated into '\n' before any other processing
(such as unliteration).
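
A rough sketch of that translation step, purely as an illustration: the
names normalizeNewlines and isLineSeparator are made up here and do not come
from the report or from any library. It assumes the source has already been
decoded into a String of Unicode code points, and it treats CR LF as a
single separator by matching it before a lone CR.

```haskell
-- Hypothetical helper: rewrite every proposed line separator to '\n'
-- so that later passes (unliteration, layout) only ever see LF.
normalizeNewlines :: String -> String
normalizeNewlines [] = []
-- CR LF counts as one separator, so it is matched before a lone CR.
normalizeNewlines ('\r' : '\n' : rest) = '\n' : normalizeNewlines rest
normalizeNewlines (c : rest)
  | isLineSeparator c = '\n' : normalizeNewlines rest
  | otherwise         = c : normalizeNewlines rest

-- The single-character separators from the proposal:
-- LF, VT, FF, CR, NEL (U+0085), LS (U+2028), PS (U+2029).
isLineSeparator :: Char -> Bool
isLineSeparator c = c `elem` "\n\v\f\r\x0085\x2028\x2029"
```

After this pass, the standard `lines` function splits the text at exactly
the proposed boundaries; for example, `lines (normalizeNewlines "a\r\nb\rc")`
gives `["a","b","c"]`.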