[Haskell-cafe] Layout rule (was Re: PrefixMap: code review request)

Tue Feb 28 20:36:01 EST 2006

Ben Rudiak-Gould wrote:
> Brian Hulley wrote:
>> Here is my proposed layout rule:
>>
>> 1) All layout keywords (where, of, let, do) must either be followed
>> by a single element of the corresponding block type, an explicit
>> block introduced by '{', or a layout block whose first line starts
>> on the *next* line
>
> I wouldn't have much trouble adapting to that.
>
>> and whose indentation is accomplished *only* by tabs
>
> You can't be serious. This would cause far more problems than the
> current rule.

Why? Surely typing one tab is better than having to hit the spacebar 4 (or 
8) times?

>
>> I would also make it that explicit braces are not allowed to switch
>> off the layout rule (ie they can be used within a layout),
>
> I don't understand. What does "used within a layout" mean?

I meant that {;} would be used just like any other construct that has to 
respect the layout rule so you could write

       let
            a = let { b = 6; z = 77;
                h = 99;
                               p = 100} in b+z+h + p

etc but not:

       let
            a = let { b = 6; z = 77;
            h = 99;          -- this binding would be part of the outermost 'let'
                               p = 100} in b+z+h + p
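
For contrast, under the current Haskell 98 rule explicit braces do switch
layout off, so the second form is accepted today and all four bindings
belong to the inner 'let'. A minimal sketch (the name `total` is made up
for illustration only):

```haskell
-- Current Haskell 98 layout: inside explicit braces the indentation of
-- continuation lines is ignored, so 'h' lining up with 'a' is harmless.
total :: Int
total =
    let
        a = let { b = 6; z = 77;
        h = 99;          -- still a binding of the inner 'let'
                    p = 100 } in b + z + h + p
    in a
```

Under the proposed rule the 'h' line would instead have to be indented
further than 'a', as in the first form above.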

>
>> multiline strings would not be permitted,
>
> They aren't now, except with \ escapes. A stray " will be caught on
> the same line unless the line happens to end with \ and the next line
> happens to begin with \, which is exceedingly unusual.
>
>> and multiline comments would not be permitted
>> (pragmas could easily be used just by using --#)
>
> But --# doesn't introduce a comment. And this would make UNPACK
> pragmas rather inconvenient to use.

-- # but I hadn't thought about UNPACK...
The motivation in both points is to make it easy for an editor to determine 
which lines need to be re-parsed based on the number of leading tabs alone.

>
>> 1) When you see a ';' you could immediately tell which block it
>> belongs to by looking backwards till the next '{'
>
> I guess that might be helpful, but it doesn't seem easier than
> looking left to the beginning of the current line and then up to the first
> less-indented line.

There was an example posted on another thread where someone had got into 
confusion by using ; after a let binding in a do construct with an explicit 
brace after the 'do' but not after the 'let' (sorry I can't find it again). 
Also the current layout rule uses the notion of an implicit opening brace,
which is to be regarded as a real opening brace as far as ';' is concerned
but as an unreal, non-existent opening brace as far as '}' is concerned.
Thus I think it is a real mix-up.
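
A sketch of the kind of mix-up described (an illustration only, not the
example from the other thread; the names `bad` and `good` are made up):

```haskell
-- Rejected under the current rule: 'let' opens an implicit block, so the
-- ';' is taken as separating bindings of that block and 'return x' is
-- read as the start of another binding.
--   bad = do { let x = 1; return x }

-- Accepted: explicit braces close the 'let' block before the ';'.
good :: IO Int
good = do { let { x = 1 }; return x }
```

The explicit '}' of the 'do' cannot rescue the first form, because the
implicit block opened by 'let' has already claimed the ';'.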

>
>> 2) Variable width fonts can be used,
>
> They can be used now, if you adhere to a certain style, but not
> everyone likes that style. I wrote in C++ with a variable width font
> and tabs at one time, but eventually went back to fixed width. One
> reason was that I couldn't use comment layout conventions that tend
> (in my experience) to improve readability more than monospacing hurts
> it. Another reason
> was that glyph widths appropriate to natural languages didn't work
> all that well for source code. Spaces are much more important in
> source code than in natural language, for example. A proportional
> font designed for source code would be nice, but I haven't found one
> yet. Stroustrup used a mixture of proportional and monospaced glyphs
> in _The C++ Programming Language_ and it worked well.
>
>> or different font faces to
>> represent different sorts of identifier eg class names, tycons, value
>> constructors, operators like `seq` as opposed to seq etc
>
> Lots of editors do this with monospaced fonts; I think it's
> orthogonal to the layout issue.

For example on Windows Trebuchet MS is a very nice font, also Verdana, both 
of which are not monospaced. But yes I agree it's not a major issue and I 
just see the option of being able to use them as a nice side-effect.

>
>> 3) Using only tabs ensures that vertical alignment goes to the same
>> position on the screen regardless of the font and tabs could even
>> have different widths just like in a wordprocessor
>
> Requiring tabs is a really bad idea. Just forget it. Seriously.

I'm really puzzled here. I've been using tabs to indent my C++ code for at 
least 10 years and don't see the problem. The only problem would be if 
someone mixed tabs with spaces. Since it has to be either tabs only or 
spaces only I'd choose tabs only to save keystrokes. I suppose though it is 
always going to be a matter of personal taste...

>
>> 4) Any keypress has a localised effect on the parse tree of the
>> buffer as a whole ( { " no longer kill everything which follows and
>> there would be no {- )
>
> I don't understand why this is an advantage. If you have an editor
> that highlights comments in green, then large sections of the program
> will flash green while you type a {- -} comment, which might be
> annoying, but it also means you'll never forget to close the comment,
> so the practical benefit of forbidding {- -}, as opposed to simply
> not typing it yourself, seems nil.

But it makes it much easier for the editor to determine where to start 
re-parsing from (see below). If you allow {- everything becomes a lot more 
complicated and who needs them anyway? In Visual C++ for example, you just 
select a block of text and type Control-K Control-C to put single line 
comments at the beginning of each line in the block. I think (apart from 
UNPACK) multi-line comments are just a left-over from very old days before 
single line comments were invented and editors had these simple macros 
built-in.

>
>> 5) It paves the way for a much more immersive editing environment,
>> but I can't say more about this at the moment because I haven't
>> finished writing it yet and it will be a commercial product :-)))
>
> I guess everything has been leading up to this, but my reaction is
> that it renders the whole debate irrelevant. The only reason layout
> exists in the first place is to make source code look good in
> ordinary text editors. If you have a high-level source code editor
> that manipulates the AST,
> then you don't need layout, or tabs, or any of that silly ASCII
> stuff. The only time you need to worry about layout is when
> interoperating with implementations that use the concrete syntax, and
> then there's nothing to stop you from exporting in any style you
> like. And when importing, there's no reason to place restrictions on
> Haskell's layout rule, because the visual layout you display in the
> editor need have no connection to the layout of the imported file.

Both 4) and 5) are because it is very much faster to type raw ASCII into an 
editor than to have to click on all kinds of boxes with the mouse. It is 
also surprisingly difficult to find an intuitive way of navigating a tree of 
boxes using only an ASCII keyboard, because of the conflict between the 
logical parent/child or sibling/sibling relationship and the way these could 
be laid out on the screen in 2d. Eg the right arrow could mean "go to the 
next sibling" but the next sibling may have to be laid out underneath the 
current node, so it all becomes very confusing. I don't think it is possible 
to lay out code in such a way that all parent/child relationships correspond 
to a vertical relationship on the screen, but possibly my thinking is too 
influenced by how programs are usually laid out.

Thus I don't think an editor that forced people to work directly with the 
AST would ever catch on. Years ago I read about a Microsoft project on 
"intentional programming" which was about manipulating an arbitrary AST 
directly but afaik nothing came of it since it was just too painful to use. 
I also read about some research to do with deriving programs interactively 
from a proof, where clicking on boxes etc comes into its own, but I don't 
think there is yet any interactive proof system that comes close to being 
able to derive "real world" software. Certainly I'd be very interested to 
know if there is.

Currently all the ASCII editors I know of only do keyword highlighting, or 
occasional ("wait a second while I'm updating the buffer") identifier 
highlighting. What I'm trying to get however is complete grammatical 
highlighting and type checking that is instantaneous as the user types code, 
so this means that the highlighter/type checker needs a complete AST (with 
'gap' nodes to represent spans of incomplete/bad syntax) to work from.

However it is way too expensive to re-parse the whole of a big buffer after 
every keypress (I tried it with a parser written in C++ to parse a variant 
of ML and even with the optimized build and as many algorithmic 
optimizations as I could think of it was just far too slow, and I wasn't 
even trying to highlight duplicate identifiers or do type inference).

Thus, to get a fast, responsive editing environment that maintains a parse
of the whole buffer (to provide effective grammatical highlighting and not
just trivial keyword highlighting), it seems (to me) essential to be able
to limit the effect of any keystroke. This matters especially when the
user is just typing text from left to right but there is more stuff after
the cursor, eg if the user has gone back to editing a function at the top
of a file. Things like {- would mean that all the parse trees for
everything after it would have to be discarded. Also, flashing of
highlighting on this scale could be very annoying for a user, so I'd
rather just delete this particular possibility of the user getting annoyed
when using my software :-) thus my hopeless attempts to convince everyone
that {- is bad news all round :-)))
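
To illustrate how the number of leading tabs alone could bound the
re-parse, here is a minimal sketch. It is not the editor's actual
implementation; the names `leadingTabs`, `startsTopLevel` and
`reparseSpan` are hypothetical, and it assumes the proposed rule holds
(tab-only indentation, no multiline comments or strings), so that a
non-comment line with no leading tab always starts a new top-level
declaration.

```haskell
import Data.List (isPrefixOf)

-- | Number of leading tab characters on a line.
leadingTabs :: String -> Int
leadingTabs = length . takeWhile (== '\t')

-- | Hypothetical test for "this line starts a top-level declaration":
-- non-empty, no leading tab, and not a single-line comment.
startsTopLevel :: String -> Bool
startsTopLevel l =
    not (null l) && leadingTabs l == 0 && not ("--" `isPrefixOf` l)

-- | Inclusive, 0-based range of lines to re-parse after an edit on line
-- 'edited': back to the enclosing top-level line, and forward to just
-- before the next one (or to the end of the buffer).
reparseSpan :: [String] -> Int -> (Int, Int)
reparseSpan ls edited = (start, end)
  where
    indexed = zip [0 ..] ls
    start = case [i | (i, l) <- take (edited + 1) indexed, startsTopLevel l] of
        [] -> 0
        is -> last is
    end = case [i | (i, l) <- drop (edited + 1) indexed, startsTopLevel l] of
        []      -> length ls - 1
        (i : _) -> i - 1
```

For the buffer ["f x =", "\tlet", "\t\ty = x", "\tin y", "", "g = 1"], an
edit on line 2 gives the span (0,4), so the parse tree for 'g' is left
untouched. A {- typed on line 2 would break exactly this property, since
its extent cannot be read off the indentation.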

Of course you're right that for loading and saving files I could do 
Haskell -> My representation -> Haskell but then someone reading a tutorial 
on Haskell would find that my editor (or by the above arguments, any similar 
rich-feedback editor) didn't accept all the examples...

>
>> Using my self-imposed layout rule I'm currently editing all my
>> Haskell code in a standard text editor using tabs set to 4 spaces
>> and a variable width font and have no problems.
>
> Which is the best argument for keeping the current rule! If it were
> changed as you propose, then someday Hugh Briley would come along and
> complain that Haskell's layout syntax squandered screen space---but
> he *wouldn't* be able to program in his preferred style, because it
> would no longer be
> allowed. Religious freedom is a good thing.

Freedom has many dimensions, some interdependent:

Simplifying the syntax by using a simpler layout rule would make it possible 
for people to create very efficient incremental parsers and in turn develop 
more responsive development environments for consensus Haskell, which in 
turn might lead to more people understanding and therefore using the 
language, more libraries, more possibilities for individual people to earn a 
living themselves by programming, more people thinking things out for 
themselves instead of absorbing corporate propaganda, thus fairer laws, 
justice, liberty, and *true human freedom* for all!!! :-)

Regards, Brian.

--------------------------------------------------------

"... flee from the Hall of Learning. This Hall is dangerous in its 
perfidious beauty, is needed but for thy probation. Beware, Lanoo, lest 
dazzled by illusive radiance thy soul should linger and be caught in its 
deceptive light."
                                        -- Voice of the Silence stanza 33