layout rule infelicity

George Russell ger@tzi.de
Thu, 30 May 2002 17:18:04 +0200


Jon Fairbairn wrote
[snip]
> Well, there's two things to consider: Haskell 98, which
> probably shouldn't change, and extended Haskell, which
> probably should. Especially if we can make the rules both
> simpler and better.
[snip]
How can I resist?  I proposed the following revised layout rule some time 
ago in a message to the Twa Simons.  Note that unlike the standard Haskell 
layout rules it does not need to read the parser's mind.  Of course the problem 
is that while it should work fine for the way I lay out Haskell, it might not 
work for other people.


We represent the lines in a file in a tree-like structure:
   data Grouped line = Grouped line [Grouped line]
The meaning of 
   Grouped A lines
is a line A, followed by a list of groups, each beginning at the same deeper
indentation.

So for example

A
  B
    C
  D

would go to something like 
   Grouped A [Grouped B [Grouped C []],Grouped D []]
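
A minimal sketch of how the grouping could be done, not the code I actually
wrote.  The names indentOf, groupLines and groupText are made up for
illustration, and I assume indentation consists of spaces only and that blank
lines can be skipped.  Badly indented input is reported as a Left.

```haskell
data Grouped line = Grouped line [Grouped line]
  deriving (Eq, Show)

-- | Column at which a line starts (assumption: spaces only, no tabs).
indentOf :: String -> Int
indentOf = length . takeWhile (== ' ')

-- | Turn (indentation, line) pairs into a list of groups.  Every group in
-- the result must start at the column of the first line; anything deeper
-- that follows a line becomes part of that line's group.
groupLines :: [(Int, a)] -> Either String [Grouped a]
groupLines [] = Right []
groupLines ls@((n, _) : _) = go ls
  where
    go [] = Right []
    go ((m, x) : rest)
      | m /= n =
          Left ("inconsistent indentation: expected column " ++ show n
                ++ ", got " ++ show m)
      | otherwise = do
          let (children, siblings) = span ((> n) . fst) rest
          kids <- groupLines children
          sibs <- go siblings
          Right (Grouped x kids : sibs)

-- | Convenience wrapper working on the text of a whole file.
-- Blank lines are dropped (an assumption of this sketch).
groupText :: String -> Either String [Grouped String]
groupText src =
  groupLines [ (indentOf l, dropWhile (== ' ') l)
             | l <- lines src, any (/= ' ') l ]
```

With this, groupText "A\n  B\n    C\n  D\n" gives the grouping shown above,
wrapped in Right.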

In the code I've written
   A  
       B
     C
produces an error message, but on second thoughts I think the best behaviour
would be to treat it like
   A ++ B
      C
though it's too late to code that now . . .

The layout processor would group the lines according to this algorithm.  It
would then output the result of the grouping.  When it came to
   Grouped first rest
it would determine whether the last token of first is "do", "of", "where" or
"let", and whether rest does _not_ begin with a "{" token.  If both these
conditions were satisfied, it would output "{" before, ";" in between elements,
and "}" after when outputting the "rest" list.
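
The output step could be sketched roughly as follows.  This is an illustration
under assumptions, not a real implementation: the names opensBlock,
startsWithBrace and render are invented here, splitting on whitespace stands in
for a proper lexer, the result is flattened onto one line, and a group with an
empty "rest" is left alone (the rule above does not say what should happen
then).

```haskell
import Data.List (intercalate)

data Grouped line = Grouped line [Grouped line]

-- | Is the last token of the line one that opens a layout block?
-- (Assumption: whitespace-separated words approximate tokens.)
opensBlock :: String -> Bool
opensBlock l = case words l of
  [] -> False
  ws -> last ws `elem` ["do", "of", "where", "let"]

-- | Does the nested list already begin with an explicit "{"?
startsWithBrace :: [Grouped String] -> Bool
startsWithBrace (Grouped l _ : _) = take 1 (dropWhile (== ' ') l) == "{"
startsWithBrace []                = False

-- | Output one group, inserting "{", ";" and "}" where both conditions hold.
render :: Grouped String -> String
render (Grouped first rest)
  | null rest = first
  | opensBlock first && not (startsWithBrace rest) =
      first ++ " {" ++ intercalate "; " (map render rest) ++ "}"
  | otherwise = unwords (first : map render rest)
```

For instance, render (Grouped "f = do" [Grouped "act1" [], Grouped "act2" []])
yields "f = do {act1; act2}".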

This seems to me to solve most of the fundamental problems, and be somewhat more
intuitive than the existing algorithm.  It would behave differently in that
   do
      if test 
         then do
            act1
            act2
         else do
            act3 
            act4
is legal.  But it would also be necessary to alter the context-free syntax so
that
(1) the contents of the module were not separated by ";"s, but by each being a
    single item in the [Grouped line] list.
    (The old where {decl1 ; decl2 ;  . . . ; decln} syntax would probably have
    to remain, for compatibility reasons).
(2) single-line forms without braces, like "let a = 5 in a+a" work.

This is only a first approximation, in that

   do
      if test
      then do
         act1
         act2
      else do
         act3
         act4
isn't legal.  Perhaps one way of fixing this is to modify the layout algorithm
so that tokens such as "then", "else", "in" and ")", before which a semicolon
can't make any sense anyway, get tagged onto the previous group if that began at
the same column as they did.
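
One possible reading of that fix, as a pass over the already grouped lines.
This is only a sketch: attach, isContinuation and opensBlock are hypothetical
names, whitespace splitting again stands in for a lexer, and I have assumed
that a line is only moved when the previous group's first line does not itself
open a block (otherwise the moved line would end up inside the generated
braces).  Groups in the same list start at the same column by construction, so
the column condition needs no separate check here.

```haskell
data Grouped line = Grouped line [Grouped line]
  deriving (Eq, Show)

-- | Does the line start with a token before which ";" makes no sense?
isContinuation :: String -> Bool
isContinuation l = case words l of
  (w : _) -> w `elem` ["then", "else", "in"] || take 1 w == ")"
  []      -> False

-- | Is the last token of the line one that opens a layout block?
opensBlock :: String -> Bool
opensBlock l = case words l of
  [] -> False
  ws -> last ws `elem` ["do", "of", "where", "let"]

-- | Move each continuation group to the end of the preceding group
-- in the same list, at every level of nesting.
attach :: [Grouped String] -> [Grouped String]
attach = reverse . foldl step [] . map descend
  where
    descend (Grouped l kids) = Grouped l (attach kids)
    -- The accumulator holds the groups seen so far, most recent first.
    step (Grouped prev kids : done) g@(Grouped l _)
      | isContinuation l && not (opensBlock prev) =
          Grouped prev (kids ++ [g]) : done
    step done g = g : done
```

Applied to the grouping of the example above, this makes "then do" and
"else do" part of the "if test" group, which is the same tree the earlier,
legal layout produces.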

I don't claim this as the perfect solution.  But since layout is something which
is rather confusing and at the moment seems to have distinctly rough edges, it
might be worthwhile experimenting with something like this, to see how much code
it would break.