[Haskell-cafe] accessible layout proposal?

Wed Sep 23 20:11:54 EDT 2009

> [PPIG: this is the middle of a discussion about style in the
programming language Haskell and the degree to which one could
improve things by overloading newline.]

> This proposal notwithstanding, I find 'concat [a, b, c]' much more
> readable than (a) ++ (b) ++ (c),

Yes, but now you've changed it again.  The parentheses are not
needed that often.  The thing about concat [a,b,c] is that the
operation is HIDDEN.  Suppose for the sake of example (and I
repeat that non-trivial examples are what we _really_ need)
that you have something that needs to be broken across multiple
lines, then
	(   e1
	++  e2
	++  e3
	++  e4
	)
tells you on (nearly) *every line* what the ---- is going on,
whereas
	concat [
	    e1
	,   e2
	,   e3
	,   e4
	]
does not.

I don't find parentheses a problem.  I *do* find lots of
code looking identical but meaning very different things
to be a problem.

> Furthermore,
> why repeat the (++) over and over again,

(A) To remind the reader over and over again
(B) In fact such expressions often mix ++ and :
     which in concat[] requires you to add square brackets
(C) "++" isn't _that_ much harder to type than ",",
     so there isn't any real downside to making my code clearer.
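
Point (B) can be made concrete with a small sketch.  The names
header, body and footer are hypothetical, chosen only for
illustration.  The two definitions are meant to build the same
list, but the concat form has to wrap the single element in
square brackets.

```haskell
-- Hypothetical example: build a list of lines from one header line,
-- a list of body lines, and a list of footer lines.

-- Operator form: ':' and '++' are both infixr 5, so this parses as
-- header : (body ++ footer) and needs no parentheses.
withOperators :: String -> [String] -> [String] -> [String]
withOperators header body footer =
       header
    :  body
    ++ footer

-- concat form: the single element must be wrapped as [header]
-- so that every item in the outer list has the same type.
withConcat :: String -> [String] -> [String] -> [String]
withConcat header body footer =
    concat
        [ [header]
        , body
        , footer
        ]
```

In the operator form each line says whether it contributes one
element or a whole list; in the concat form only the brackets do.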

> when really you are making a
> *list* of things to add,

No, I'm really *NOT*.  If I write a++b++c, the list [a,b,c]
is *not* what I want to make; it is of no interest, it is at
best a distraction from what's *really* supposed to be
going on.  I would find it absurd to write

	sum [
	    e1
	,   e2
	,   e3
	]

instead of e1 + e2 + e3 (and just as concatenations often mix
++ and :, so sums often mix + and -).  I'm not worried about
efficiency here:  perhaps mistakenly I trust the Haskell
compiler to unfold "concat [a,b,c]" or "sum [a,b,c]".  What
bothers me is misleading any human beings who have the
misfortune to have to try to read the code.
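
The same point for sums that mix + and -, as a sketch.  The names
income, bonus and tax are hypothetical.  In the sum [...] form the
subtraction has to be restated as adding a negated term.

```haskell
-- Hypothetical names, for illustration only.

-- Operator form: each line shows whether a term is added or subtracted.
netOperators :: Int -> Int -> Int -> Int
netOperators income bonus tax =
      income
    + bonus
    - tax

-- sum form: the subtraction is rewritten as adding 'negate tax'.
netSum :: Int -> Int -> Int -> Int
netSum income bonus tax =
    sum
        [ income
        , bonus
        , negate tax
        ]
```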

I don't hesitate to use concat [...] or sum [...] or the
like when [...] is a list comprehension, because I don't
know a clearer way to do that.
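
Two small sketches of that comprehension case.  Both function
names are hypothetical and chosen only for illustration.

```haskell
-- Sum of the squares of the even numbers in a list.
sumEvenSquares :: [Int] -> Int
sumEvenSquares xs = sum [ x * x | x <- xs, even x ]

-- Indent every line of every non-empty block and join the blocks.
indentNonEmpty :: [[String]] -> [String]
indentNonEmpty blocks =
    concat [ map ("  " ++) block | block <- blocks, not (null block) ]
```

Here the list really is the thing being built, so naming the
operation once, in front of the comprehension, hides nothing.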

Turning to the new proposal being developed,
suppose one wrote

	data Tree k v			-- V2
	     Empty
	     Fork
		 key   :: k
		 value :: v
		 left  :: Tree k v
		 right :: Tree k v

instead of
	data Tree k v			-- V1
	   = Empty
	   | Fork {
		key   :: k
	      , value :: v
	      , left  :: Tree k v
	      , right :: Tree k v
	      }

This could indeed work.  I have seen a formal specification
language where records (but not sum-of-product types) were
declared using layout, not unlike this.
Here we have newline overloaded to mean "or" (Empty or
Fork) and "and" (key and value and left and right).
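
For comparison, here is V1 in the form current Haskell accepts,
with the "or" and "and" roles marked in comments.  The function
lookupTree is a hypothetical addition for illustration and assumes
the tree is kept ordered as a binary search tree.

```haskell
-- '|' separates the alternatives ("or");
-- ',' separates the fields of one alternative ("and").
data Tree k v
   = Empty
   | Fork
        { key   :: k
        , value :: v
        , left  :: Tree k v
        , right :: Tree k v
        }

-- Hypothetical lookup, assuming binary-search-tree ordering on keys.
lookupTree :: Ord k => k -> Tree k v -> Maybe v
lookupTree _ Empty = Nothing
lookupTree k t@Fork{} =
    case compare k (key t) of
        LT -> lookupTree k (left t)
        GT -> lookupTree k (right t)
        EQ -> Just (value t)
```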

It could *work*, and it would be less typing for the
*author*, but what would it do for the *reader*?
The leading punctuation marks "=" "|" "," give the
reader an immediate and unmistakable clue about what
role the rest of the line plays.

I'd actually prefer to write
	data Tree k v			-- V3
	   | Empty
	   | Fork {
	      , key   :: k
	      , value :: v
	      , left  :: Tree k v
	      , right :: Tree k v
	      }

What we really need here is somebody who is willing and
able to conduct some experiments to compare the readability
of these and other ways of writing data declarations.
*I* prefer V3 to V1 to V2, but since my argument is that
readers need more help than writers, experimental evidence
that V2 is better for others than V1 or V3 would persuade
me to adopt it.

For this reason I've cc'ed this message to the psychology
of programming interest group mailing list.