[Haskell-cafe] Unicode Haskell source -- Yippie!

Rustom Mody rustompmody at gmail.com
Wed Apr 30 08:21:38 UTC 2014


Hi Richard
Thanks for a vigorous and rigorous appraisal of my blog post:
http://blog.languager.org/2014/04/unicoded-python.html

However, this is a Haskell list, and my post is not just a discussion of
Python but some brainstorming about how Python could change, so a detailed
discussion of it is probably too off-topic here, don't you think?

So for now let me address just one of your points, which is appropriate for
this forum.

I'd be pleased to discuss the other points you raise off list.

Also, while I've learnt a lot from this thread, I also see some confusions
and fallacies.
So before drilling down into details and losing the forest for the trees,
I'd prefer to start with a broad perspective rather than a narrow
technological focus -- more at end.


On Tue, Apr 29, 2014 at 11:04 AM, Richard A. O'Keefe <ok at cs.otago.ac.nz> wrote:

> Before speaking of "Apl's mistakes", one should be
> clear about what exactly those mistakes *were*.
> I should point out that the symbols of APL, as such,
> were not a problem.  But the *number* of such symbols
> was.  In order to avoid questions about operator
> precedence, APL *hasn't* any.  In the same way,
> Smalltalk has an extensible set of 'binary selectors'.
> If you see an expression like
>
>         a ÷> b ~@ c
>
> which operator dominates which?  Smalltalk adopted the
> same solution as APL:  no operator precedence.
>
> Before Pascal, there was something approaching a
> consensus in programming languages that
>         **                      tightest
>         *,/,div,mod
>         unary and binary +,-
>         relational operators
>         not
>         and
>         or
> In order to make life easier with user-defined
> operators, Algol 68 broke this by making unary
> operators (including not and others you haven't
> heard of like 'down' and 'upb') bind tightest.
> As it turned out, this may have made life
> easier for the compiler, but not for people.
> In order, allegedly, to make life easier for
> students, Pascal broke this by making 'or'
> and 'and' at the same level as '+' and '*'.
> To this day, many years after Pascal vanished
> (Think Pascal is dead, MrP is dead, MPW Pascal
> is dead, IBM mainframe Pascal died so long ago
> it doesn't smell any more, Sun Pascal is dead, ...)
> a couple of generations of programmers believe
> that you have to write
>         (x > 0) && (x < n)
> in C, because of what their Pascal-trained predecessor
> taught them.
>
> If we turn to Unicode, how should we read
>
>         a ⊞ b ⟐ c
>
> Maybe someone has a principled way to tell.  I don't.
>

Without claiming to cover all cases, here is one 'principle'.
If we have:
(⊞) :: a -> a -> b
(⟐) :: b -> b -> c

then ⊞'s precedence should be higher than ⟐'s: when a and b are different
types, a ⊞ b ⟐ c ⊞ d can only be typed as (a ⊞ b) ⟐ (c ⊞ d).
This is what makes it natural to have the precedences of (+) (<) (&&) in
decreasing order.
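
A minimal sketch of that principle in Haskell, assuming a UTF-8 source file.
The two operators, their definitions, and the fixity numbers 5 and 3 are all
made up for illustration; only the shape of the types comes from the text
above.

```haskell
module Main where

-- Hypothetical operators mirroring the shapes in the text:
--   (⊞) :: a -> a -> b   with a = Int,  b = Bool
--   (⟐) :: b -> b -> c   with b = Bool, c = String
(⊞) :: Int -> Int -> Bool
x ⊞ y = abs (x - y) <= 1          -- "x and y are neighbours"

(⟐) :: Bool -> Bool -> String
p ⟐ q = if p == q then "agree" else "differ"

-- ⊞ binds tighter than ⟐, as the typing suggests.
infix 5 ⊞
infix 3 ⟐

-- Parses as (1 ⊞ 2) ⟐ (10 ⊞ 20); the grouping 1 ⊞ (2 ⟐ 10) ⊞ 20
-- would not type-check.
example :: String
example = 1 ⊞ 2 ⟐ 10 ⊞ 20

-- The Prelude follows the same pattern: infixl 6 +, infix 4 <, infixr 3 &&.
standard :: Bool
standard = (1 :: Int) + 2 < 4 && 2 + 2 < (5 :: Int)

main :: IO ()
main = do
  putStrLn example
  print standard
```

Swapping the two fixity numbers makes the compiler reject example, which is
the point: the types already say which grouping is wanted.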

This is also why the bitwise operators in C have the wrong precedence:
x & 0xF == 0xF
has only one meaningful interpretation, (x & 0xF) == 0xF; C chooses the
other, x & (0xF == 0xF)!
The error comes (probably) from treating & as close to the logical
operators like &&, whereas in fact it is more akin to arithmetic operators
like +.
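
For contrast, a small sketch of the same test in Haskell, where Data.Bits
declares (.&.) as infixl 7, next to the arithmetic operators, and (==) is
infix 4. The function names are mine, and cStyleReading only imitates C's
grouping so that the two readings can be compared side by side.

```haskell
module Main where

import Data.Bits ((.&.))

-- The intended test: are the low four bits of x all set?
-- Parses as (x .&. 0xF) == 0xF because .&. binds tighter than ==.
lowNibbleSet :: Int -> Bool
lowNibbleSet x = x .&. 0xF == 0xF

-- An imitation of C's grouping of `x & 0xF == 0xF`, where == binds tighter
-- than &: x & (0xF == 0xF), i.e. x & 1. fromEnum True is 1.
cStyleReading :: Int -> Int
cStyleReading x = x .&. fromEnum ((0xF :: Int) == 0xF)

main :: IO ()
main = do
  print (lowNibbleSet 0x1F)    -- low nibble is 0xF
  print (lowNibbleSet 0x11)    -- low nibble is 0x1
  print (cStyleReading 0x11)   -- nonzero, so C would treat it as true
```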

There are of course other principles:
Dijkstra argued vigorously that, boolean algebra being completely symmetric
in (∨, True) and (∧, False), ∧ and ∨ should have the same precedence.

Evidently not too many people agree with him!
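
To see what equal precedence would mean in practice, here is a rough sketch.
It does not claim to reproduce Dijkstra's notation: the operators ∧ and ∨ and
the choice of infixr 3 for both are assumptions made for the experiment.

```haskell
module Main where

-- Hypothetical Unicode spellings of (&&) and (||), deliberately given the
-- SAME precedence and associativity (an assumption of this sketch).
(∧), (∨) :: Bool -> Bool -> Bool
(∧) = (&&)
(∨) = (||)
infixr 3 ∧
infixr 3 ∨

-- Equal precedence, right associative: False ∧ (False ∨ True)
equalPrec :: Bool
equalPrec = False ∧ False ∨ True

-- Prelude: && (infixr 3) binds tighter than || (infixr 2), so this is
-- (False && False) || True
preludePrec :: Bool
preludePrec = False && False || True

main :: IO ()
main = do
  print equalPrec
  print preludePrec
```

The two results differ, so with equal precedence any mixture of ∧ and ∨
really needs explicit parentheses to be read safely.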

----------------------
To come back to the broader questions.

I started looking at Niklas' link (thanks Niklas!)
http://www.haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#unicode-syntax

and I find that the new Unicode chars for -<< and >>- are missing.
OK, a minor doc-bug perhaps?
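
For anyone who has not tried the extension that page documents, a minimal
sketch of what it allows, assuming a UTF-8 source file and a GHC that
supports UnicodeSyntax. The function names are made up; the arrow-notation
characters are left out since they also need the Arrows extension.

```haskell
{-# LANGUAGE UnicodeSyntax #-}
module Main where

-- ∷ for ::, → for ->
twice ∷ (a → a) → a → a
twice f = f . f

-- ⇒ for =>, ← for <- (here in a list comprehension)
showAll ∷ Show a ⇒ [a] → [String]
showAll xs = [show x | x ← xs]

main ∷ IO ()
main = do
  line ← pure (twice (++ "!") "Yippie")   -- ← also works in do-notation
  putStrLn line
  mapM_ putStrLn (showAll [1 ∷ Int, 2, 3])
```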

Poking further into that web-page, I find that it has
charset=ISO-8859-1

Running w3's validator http://validator.w3.org/ on it one gets:
No DOCTYPE found!

What has this got to do with Unicode in Python source?
That depends on how one sees it.

When I studied C (nearly 30 years ago now!) we used gets as a matter of
course.
Today we don't.

Are Kernighan and Ritchie wrong in teaching it?
Are today's teachers wrong in proscribing it?

I believe the only reasonable outlook is that truth changes with time: it
was OK then; it's not today.

Likewise DOCTYPE-missing and charset-other-than-UTF-8.
Random example showing how yesterday's right becomes today's wrong:
http://www.sitepoint.com/forums/showthread.php?660779-Content-type-iso-8859-1-or-utf-8

Unicode vs ASCII in program source is similar (I believe).
My thoughts on this (of a philosophical nature) are:
http://blog.languager.org/2014/04/unicode-and-unix-assumption.html

If we can get the broader agreements (disagreements!) out of the way to
start with, we may then look at the details.

Thanks and regards,
Rusi