[Haskell-cafe] Literate haskell format unclear (implementation and specification inconsistencies)

Fri Mar 2 06:28:11 EST 2007

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Nice, I pretty much agree with you on everything :)

Ian Lynagh wrote:
> On Wed, Feb 28, 2007 at 05:48:09PM -0500, Isaac Dupree wrote:
>> Trying to implement literate haskell[*], I realized several
>> ways in which the correct behavior for unliterating (especially with
>> regard to errors) was unclear.  I have several cases which ghc, hugs
>> and Haskell 98 have differing opinions on!  The Report as it stands
>> is far from a clear and complete specification (and I didn't find
>> anything in the Haskell' wiki/trac about literate haskell).
> 
> Hmm, some of this came up around the time the revised report was being
> written:
> 
> http://www.haskell.org/pipermail/haskell/2001-December/008549.html
> http://www.haskell.org/pipermail/haskell/2001-December/008550.html
> 
> but oddly doesn't seem to have been clarified in the report. We should
> definitely make sure that Haskell' does so!
> 
>> 1.[UnmatchedBegin]
>> If a \begin{code} starts a section of code, is \end{code}
>> _required_ before the end of the file?
> 
> I would say yes.
> 
>> 2.[AfterBeginOrEnd/{BeginWhite,EndWhite,BeginPrint,EndPrint}]
>> Can a line beginning \begin{code} or \end{code} have additional
>> stuff on the end, where the directive is understood and the
>> additional stuff is ignored?
> 
> I would say yesIffAdditionalStuffIsInvisible (although I wouldn't object
> to "no"; trailing white space makes me sad).

"No" would not be that bad if compiler error messages actually clearly
told you "You have trailing whitespace characters on line X - you may
not be able to see them, but they're there, and you should delete them!"
or something equally specific.
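
As a minimal sketch of what I mean (the names Delim, checkDelim and
delimError are hypothetical, not taken from ghc, hugs or nhc98, and the
wording of the message is only an illustration), an unlitter could
classify what follows a directive and name invisible characters
explicitly:

```haskell
import Data.Char (isSpace)
import Data.List (stripPrefix)

-- Hypothetical classification of one line against a directive such as
-- "\\begin{code}" or "\\end{code}".
data Delim
  = NotDelim            -- the line does not start with the directive
  | Exact               -- the directive and nothing else
  | TrailingWhite Int   -- the directive followed by N whitespace chars
  | TrailingJunk String -- the directive followed by visible text
  deriving (Eq, Show)

checkDelim :: String -> String -> Delim
checkDelim directive line =
  case stripPrefix directive line of
    Nothing -> NotDelim
    Just "" -> Exact
    Just rest
      | all isSpace rest -> TrailingWhite (length rest)
      | otherwise        -> TrailingJunk rest

-- Under the strict "no" answer, anything after the directive is an
-- error; the whitespace case gets its own, more specific message.
delimError :: Int -> String -> String -> Maybe String
delimError lineNo directive line =
  case checkDelim directive line of
    TrailingWhite n ->
      Just ("line " ++ show lineNo ++ ": " ++ show n
            ++ " trailing whitespace character(s) after " ++ directive
            ++ "; you may not be able to see them, but they are there"
            ++ " and should be deleted")
    TrailingJunk rest ->
      Just ("line " ++ show lineNo ++ ": unexpected text after "
            ++ directive ++ ": " ++ show rest)
    _ -> Nothing
```

For example, delimError 7 "\\end{code}" "\\end{code}  " reports two
trailing whitespace characters on line 7, while an exact "\\end{code}"
line yields Nothing.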

> And nothing may precede "\begin{code}" or "\end{code}".

Yes, this is already universally true in the Report as well as
implementations, I believe.

> 
>> 3.[IgnoringStringLiterals/{A,B}]
>> what does "(ignoring string literals, of course)" mean?
>> that the following(A) makes str = "string gap:end{code}" and an
>> unended code block(A), or that it makes an ended code block(B)?
>> (A)---------
>> \begin{code}
>> str = "string gap:\
>> \end{code}"
> 
> I didn't follow your question, but I think that in order to allow
> things to be nicely compositional
> 
>     \begin{code}
>     str = "string gap:\
>     \end{code}"
>     \end{code}
> 
> should be rejected by the unlitter for having trailing characters
> following "\end{code}". Did that answer it?

Yes, your answer is at least as clear as my question... it says that, in
order to be nicely compositional, the unlitter should not have to know
about Haskell string syntax. That is case (B), except that my example
should probably be an error anyway (as per our preferred answer on
2.[AfterBeginOrEnd]).

> 
>> 4.[ExtraBeginEnd/{ExtraBegin,ExtraEnd}]
>> What happens if \begin{code} appears after another \begin{code}
>> before an \end{code}; and what happens if an \end{code} appears
>> without a code block previously having been started by a \begin{code}?
>> stray end:
>>    ghc, nhc98:[UNLIT_IGNORED (-> probable successful compile)]
>>          hugs:[error "\end{code} encountered outside code block"]
>> stray begin:
>>     ghc, nhc98:[UNLIT_IGNORED (-> probable syntax error)]
>>           hugs:[error "\begin{code} encountered inside code block"]
> 
> I agree with hugs.

Yes.  It would be nice if nothing required unliteration to be,
semantically, top-to-bottom. I think this answer, along with
disallowing 1.[UnmatchedBegin] and the answer on
3.[IgnoringStringLiterals], defines what is allowed clearly and
symmetrically (although the location of compiler error messages can
vary, and we don't have standards-quality wording yet :).
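
To make the combined rules concrete, here is a rough sketch of a purely
line-based checker. It assumes the strict answers discussed in this
thread (a stray or nested directive is an error, an unclosed block is
an error, and a directive with anything after it is an error), and it
never looks at Haskell string syntax. The names Err and checkBlocks are
hypothetical. Unlike a real unlitter, it just returns the code lines
rather than blanking out the comment lines to preserve line numbers.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical error type; each constructor carries a 1-based line number.
data Err
  = StrayEnd Int        -- \end{code} outside a code block
  | NestedBegin Int     -- \begin{code} inside a code block
  | UnmatchedBegin Int  -- \begin{code} never closed (line of the begin)
  | TrailingChars Int   -- directive followed by extra characters
  deriving (Eq, Show)

begin, end :: String
begin = "\\begin{code}"
end   = "\\end{code}"

-- Return the lines inside code blocks, or the first error found.
checkBlocks :: [String] -> Either Err [String]
checkBlocks = go Nothing [] . zip [1 ..]
  where
    -- State: Just n means a block opened on line n is still open.
    go :: Maybe Int -> [String] -> [(Int, String)] -> Either Err [String]
    go (Just n) _   [] = Left (UnmatchedBegin n)
    go Nothing  acc [] = Right (reverse acc)
    go st acc ((i, l) : rest)
      | l == begin = case st of
          Just _  -> Left (NestedBegin i)
          Nothing -> go (Just i) acc rest
      | l == end = case st of
          Nothing -> Left (StrayEnd i)
          Just _  -> go Nothing acc rest
      | begin `isPrefixOf` l || end `isPrefixOf` l =
          Left (TrailingChars i)
      | otherwise = case st of
          Just _  -> go st (l : acc) rest
          Nothing -> go st acc rest
```

With this, the string-gap example from 3.[IgnoringStringLiterals] is
rejected at the line reading \end{code}" because of the trailing quote,
without the checker knowing anything about string gaps.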

> 
>> 5.[LexicalUnitAcrossLiterateComment/{StringGap,BlockComment}]
>> Can lexical units jump across literate comment gaps?
>> report, ghc, hugs, nhc98: yes...
> 
> I agree.
> 
>> ghc, hugs, nhc98: think it's a fine comment
> 
> I agree.
> 
>> I mention this because allowing these makes it complicated to preserve
>> literate comments in a translation to .hs,
> 
> I don't have a problem with that; I unlit, not convertlit  :-)
> 
> Allowing them makes it easier to write an implementation in a
> compositional style.
> 
>> because, other than cases
>> like these, prefixing literate comment lines with "--  " works fine.[*]
>> However, banning these could make processing that wants to report errors
>> end up more complicated.  Maybe the report could/should say that it
>> is "not advisable", as it does for mixing '>' and {code} styles?
> 
> I don't object to saying it is inadvisable.
> 
>> 6.[TeXBirdtrack/]
>> I understand that
>> "It is not advisable to mix these two styles in the same file."
>> and the report doesn't even talk about how they mix, but now that
>> I've gotten started on the implementation inconsistencies...
>> Actually, despite the Report's advice against it, there seems to be
>> a consensus on what the meaning of mixing the two styles is, which
>> I'll describe below:
>>
>> Sensibly, ghc, hugs and nhc98 treat begin/end{code} lines as blank
>> for the purposes of '>'-style comment checking (which is that
>> a code and a non-blank literate comment line can't be adjacent);
>> this works:
>> [TeXBirdtrack/NoLayout]------------
>>> module Main where
>>> {main = print str
>> \begin{code}
>> ;str = "string"}
>> \end{code}
> 
> I don't have an opinion on whether or not this should be allowed as I
> don't think you should do it anyway, but you are right that it should be
> clearly defined.

I didn't mean to suggest that it /should be/ clearly defined; I was just
clearly defining it ;). But it probably should be defined as long as it
is allowed, just so there is a single reference if nothing else.
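
As a starting point for such a reference, here is a small sketch of the
adjacency rule as I described it: a '>' line and a non-blank literate
comment line may not be adjacent, and \begin{code} / \end{code} lines
count as blank for that check. The names Kind, classify and
adjacencyErrors are hypothetical, and this is only my reading of the
observed behaviour, not code from any of the implementations.

```haskell
import Data.Char (isSpace)

-- Hypothetical per-line classification for the adjacency check.
data Kind = Bird | Blank | Comment | TeXCode
  deriving (Eq, Show)

-- Directive lines are treated as Blank; lines inside a TeX-style
-- block are TeXCode and never take part in the '>' adjacency rule.
classify :: [String] -> [Kind]
classify = go False
  where
    go _ [] = []
    go inBlock (l : ls)
      | l == "\\begin{code}" = Blank   : go True ls
      | l == "\\end{code}"   = Blank   : go False ls
      | inBlock              = TeXCode : go True ls
      | take 1 l == ">"      = Bird    : go False ls
      | all isSpace l        = Blank   : go False ls
      | otherwise            = Comment : go False ls

-- 1-based numbers of lines that clash with the line just before them.
adjacencyErrors :: [String] -> [Int]
adjacencyErrors ls =
  [ i | (i, (a, b)) <- zip [2 ..] (zip ks (drop 1 ks)), clash a b ]
  where
    ks = classify ls
    clash Bird Comment = True
    clash Comment Bird = True
    clash _ _          = False
```

On the [TeXBirdtrack/NoLayout] file this reports no errors, because the
\begin{code} line directly after the '>' lines counts as blank. It does
not flag a '>' line inside a TeX-style block; under this reading that
line is simply passed through as code and fails later, as in a .hs file.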

> 
>> Note I didn't rely on the layout rule. This should work:
>> [TeXBirdtrack/AlignedLayout]------------
>>> module Main where
>>> main = print str
>> \begin{code}
>>   str = "string"
>> \end{code}
> 
> Again no opinion, but should be the same answer as the previous one.
> 
>> As another example, this doesn't work, for the same reason
>> that you can't start a line with '>' in a .hs file:
>> [TeXBirdtrack/Wrong]------------
>>> module Main where
>>> main = print str
>> \begin{code}
>>> str = "string"
>> \end{code}
> 
> Right, this should not be allowed.
> 
> 
> Thanks
> Ian
> 
> 

Thanks
Isaac
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFF6ApKHgcxvIWYTTURAvSjAJ9EoUsTETnPhz5wpwFBY9TA4dGmFACfebzr
oEcTkylavvxDoPxOAArqEdU=
=z2D+
-----END PGP SIGNATURE-----