[Haskell-cafe] Unicode Haskell source -- Yippie!

Richard A. O'Keefe ok at cs.otago.ac.nz
Tue Apr 29 05:34:47 UTC 2014

Before speaking of "Apl's mistakes", one should be
clear about what exactly those mistakes *were*.
I should point out that the symbols of APL, as such,
were not a problem.  But the *number* of such symbols
was.  In order to avoid questions about operator
precedence, APL *hasn't* any.  In the same way,
Smalltalk has an extensible set of 'binary selectors'.
If you see an expression like

	a ÷> b ~@ c

which operator dominates which?  Smalltalk adopted the
same solution as APL:  no operator precedence.

Before Pascal, there was something approaching a
consensus in programming languages that
	**			tightest
	unary and binary +,-
	relational operators
In order to make life easier with user-defined
operators, Algol 68 broke this by making unary
operators (including not and others you haven't
heard of like 'down' and 'upb') bind tightest.
As it turned out, this may have made life
easier for the compiler, but not for people.
In order, allegedly, to make life easier for
students, Pascal broke this by making 'or'
and 'and' at the same level as '+' and '*'.
To this day, many years after Pascal vanished
(Think Pascal is dead, MrP is dead, MPW Pascal
is dead, IBM mainframe Pascal died so long ago
it doesn't smell any more, Sun Pascal is dead, ...)
a couple of generations of programmers believe
that you have to write
	(x > 0) && (x < n)
in C, because of what their Pascal-trained predecessors
taught them.

If we turn to Unicode, how should we read

	a ⊞ b ⟐ c

Maybe someone has a principled way to tell.  I don't.
And then we have to ask about a ⊞⟐ b ⟐⊞ c.

This is NOT a new problem.
Haskell already has way too many operators floating
around for me to remember their relative precedence,
and I have to follow a rule "when an expression
contains two operators from different 'semantic fields',
use parentheses."  Don't ask me to explain that!
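
To make the precedence point concrete, here is a minimal Haskell sketch.
The operators ⊞ and ⟐, their meanings, and their fixities are all made up
for illustration.  Nothing in the symbols tells a reader which binds
tighter; only the fixity declarations do, and an operator with no
declaration at all defaults to infixl 9.

```haskell
-- Hypothetical operators; the names and fixities are assumptions
-- chosen only to illustrate the parsing question.
infixl 6 ⊞
infixl 7 ⟐

(⊞) :: Int -> Int -> Int
a ⊞ b = a + b

(⟐) :: Int -> Int -> Int
a ⟐ b = a * b

-- Parses as 1 ⊞ (2 ⟐ 3), because ⟐ was declared at a higher level.
implicit :: Int
implicit = 1 ⊞ 2 ⟐ 3

-- The parenthesised form says what is meant without consulting
-- the fixity declarations.
explicit :: Int
explicit = (1 ⊞ 2) ⟐ 3

main :: IO ()
main = do
  print implicit
  print explicit
```

Swapping the two fixity levels changes the value of `implicit` and leaves
`explicit` alone, which is the practical case for the parentheses rule.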

Unicode does make the problem rather more pressing.
Instead of agonising over the difference between
< << <<< <<<< and the like, now we can agonise over
the difference between a couple of dozen variously
decorated and accompanied versions of the subset sign
as single characters.

Did you know that there is a single ⩵ character?
Distinct from ==?
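
A small sketch of how that plays out in Haskell source.  The operator ⩵
below is a hypothetical alias defined here for illustration; as far as I
know no standard library provides it.

```haskell
import Data.Char (ord)

-- Hypothetical alias for (==), spelled with the single character U+2A75.
infix 4 ⩵
(⩵) :: Eq a => a -> a -> Bool
(⩵) = (==)

-- The two spellings look alike but are different token texts.
oneChar :: Int
oneChar = length "⩵"    -- one code point

twoChars :: Int
twoChars = length "=="  -- two code points, both U+003D

main :: IO ()
main = do
  print (oneChar, twoChars)
  print (map ord "⩵", map ord "==")
  print ((3 :: Int) ⩵ 3, (3 :: Int) == 3)
```

A reader of a printed listing has to tell the two operators apart by eye,
while the compiler treats them as unrelated names.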

I firmly believe that *careful* introduction of
mathematical symbols can be good, but that it needs
rather more care for consistency and readability than
Haskell operators have had so far.

I think wide consideration is necessary lest we end
up with things like x ÷ y where x and y are numbers
not giving a number.

> I started with writing a corresponding list for python:
> http://blog.languager.org/2014/04/unicoded-python.html

The "Math Space Advantage" there can be summarised as:
 "if you use Unicode symbols for operators you can
  omit even more spaces than you already do, wow!"

Never mind APL.  What about SETL?
For years I yearned to get my hands on SETL so that
I could write
	(∀x∈s)(∃y∈s)f(x, y)
The idea of using *different* symbols for testing and
binding (2.2, "Dis") strikes me as "Dis" indeed.  I want
to use the same character in both places because they
mean the same thing.  It's the ∀ and ∃ that mean "bind".
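
For comparison, a minimal sketch of the same quantified formula over a
finite list in present-day Haskell.  The names `forAllExists` and
`succMod5` are invented for this example.  Note that Haskell itself
already uses different spellings for binding (`<-` in a comprehension)
and for testing (`elem`).

```haskell
-- (∀x∈s)(∃y∈s) f(x, y) for a finite list s.
-- `and` plays the part of ∀, `or` the part of ∃, and `<-` does the binding.
forAllExists :: (a -> a -> Bool) -> [a] -> Bool
forAllExists f s = and [or [f x y | y <- s] | x <- s]

-- Hypothetical predicate: y is the successor of x modulo 5.
succMod5 :: Int -> Int -> Bool
succMod5 x y = y == (x + 1) `mod` 5

main :: IO ()
main = do
  -- Every element of [0..4] has its successor modulo 5 in the list.
  print (forAllExists succMod5 [0 .. 4])
  -- In [0..3] the element 3 needs 4, which is missing.
  print (forAllExists succMod5 [0 .. 3])
  -- Membership testing is spelled differently from binding.
  print (4 `elem` [0 .. 3 :: Int])
```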

The name space burden reduction argument won't fly either.
Go back and look at

≤  less than or equal to in a partial order
   is a subgroup of
   can be reduced to
×  multiplication  
   Cartesian product
   cross product
   (as superscript) group of units

In mathematics, the same meaning may be represented by
several different symbols.  And the same symbol may be
used for several different meanings.

(If Haskell allowed prefix and superscript operators,
think of the fun we could have keeping track of
the Hodge dual  *v  and the ordinary dual:  v* .)

Representing π as π seems like a clear win.
But do we want to denote c, e, G, α, γ and other constants
with familiar 1-character names by those characters?
What if someone is writing Haskell in Greek?
(Are you reading this, Kostis?)

I STRONGLY disagree that x÷y should violate the norms of
school by returning something other than a number.
When it comes to returning a quotient and remainder,
Haskell has two ways to do this and Common Lisp has four.
I don't know how many Python has, but in a situation of
such ambiguity, it would be disastrous NOT to use words
to make it clear which is meant.
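
The two Haskell ways are `quotRem` and `divMod`.  A short sketch of where
they part company; the operands are arbitrary values picked to show the
difference.

```haskell
-- Both return (quotient, remainder); they differ in how the quotient
-- is rounded when the operands have opposite signs.
truncated :: (Int, Int)
truncated = (-7) `quotRem` 2  -- quotient truncated toward zero

floored :: (Int, Int)
floored = (-7) `divMod` 2     -- quotient rounded toward negative infinity

-- With operands of the same sign the two functions agree.
agreeOnPositives :: Bool
agreeOnPositives = (7 `quotRem` 2 :: (Int, Int)) == (7 `divMod` 2)

main :: IO ()
main = do
  print truncated
  print floored
  print agreeOnPositives
```

A single ÷ symbol could not tell a reader which of these two conventions
was intended; the words can.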

I find the use of double up arrow for exponentiation odd.
Back in the days of BASIC on a model 33 Teletype, one
used the single up arrow for that purpose.

As for floor and ceiling, it would be truer to mathematical
notation to use
	⌊x⌋
for floor.  (I note that in Arial Unicode as this appears
on my screen these characters look horrible.  They should
have the same vertical extent as the square brackets they
are derived from.  Cambria Math and Lucida Sans are OK.)

The claim that "dicts are more fundamental to programming
than sets" appears to be falsified by SETL, in which
dicts were just a special case of sets.  (For that matter,
so they were in Smalltalk-80.)
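
A rough sketch of that idea in Haskell, using `Data.Set` from the
containers package.  The `Dict` synonym, `lookupD`, and the sample data
are hypothetical names for illustration, and the lookup rule is my
reading of the set-of-pairs view, not a transcription of SETL semantics.

```haskell
import qualified Data.Set as Set

-- A "dict" that is nothing more than a set of (key, value) pairs.
type Dict k v = Set.Set (k, v)

-- Lookup succeeds only when exactly one pair carries the key.
lookupD :: Eq k => k -> Dict k v -> Maybe v
lookupD k d = case [v | (k', v) <- Set.toList d, k' == k] of
  [v] -> Just v
  _   -> Nothing

-- Hypothetical sample data.
ages :: Dict String Int
ages = Set.fromList [("ada", 36), ("alan", 41)]

main :: IO ()
main = do
  print (lookupD "ada" ages)
  print (lookupD "grace" ages)
  -- Ordinary set operations still apply to the dict.
  print (Set.size ages, Set.member ("alan", 41) ages)
```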

For existing computational notations with rich sets of
mathematical symbols look at Z and B.  (B as in the B
method, not as in the ancestor of C.)

The claim that mathematical expressions cannot be written
in Lisp or COBOL is clearly false.  See Interlisp, which
allowed infix operators.  COBOL uses "-" for subtraction,
it just needs spaces around it, which is a Good Thing.
Using the centre dot as a word separator would have more
merit if it weren't so useful as an operator.

The reference to APL has switched the operands of take
and drop.  It should be
	number_to_keep ↑ vector
	number_to_lose ↓ vector
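
The same operand order can be mimicked in Haskell.  This is only a
sketch: the operators ↑ and ↓ and their fixity are defined here for
illustration, and it covers non-negative counts only (APL's take and
drop also accept negative counts, which Haskell's `take` and `drop` do
not imitate).

```haskell
-- Hypothetical operators with the count on the left, as in APL.
infixr 5 ↑, ↓

(↑) :: Int -> [a] -> [a]
(↑) = take

(↓) :: Int -> [a] -> [a]
(↓) = drop

kept, rest :: [Int]
kept = 2 ↑ [10, 20, 30, 40]  -- number_to_keep ↑ vector
rest = 2 ↓ [10, 20, 30, 40]  -- number_to_lose ↓ vector

main :: IO ()
main = do
  print kept
  print rest
```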
