Brian Hulley brianh at metamilk.com
Fri Sep 8 10:30:33 EDT 2006

```Cale Gibbard wrote:
> On 17/08/06, Brian Hulley <brianh at metamilk.com> wrote:
>> In contrast, a programming language should be based on general
>> concepts uniformly applied. In Haskell we have operators,
>> identifiers, prefix application using an identifier and infix
>> application using a symbol, and a uniform way to convert a symbol to
>> an identifier and vice versa, and a uniform way of forming sections.
>
> Secondly, I think it's quite a reasonable thing to do to treat unary
> negation as a separate operation. It follows quite naturally to do so
> from the definition of a ring. While having separate literals for
> negative numbers might be okay, it seems unnecessary in light of the
> fact that we *do* want a nice looking unary negation symbol, which
> doesn't strictly apply to literals. If -x suddenly became a
> non-expression, and I had to write 0-x, -1*x or (negate x) instead,
> I'd consider that a severe enough bug that I would avoid upgrading my
> compiler until it was fixed.

Leaving aside the question of negative literals for the moment, what's so
special about unary minus that it warrants a special syntax? For example in
mathematics we have x! to represent (factorial x), which is also an
important function, yet no-one is arguing that we should introduce a unary
postfix operator to Haskell just to support it.

In maths we also have |x| to denote another common function, (abs x), yet
afaia everyone is happy to just write (abs x).

Would the elimination of the special case rule for unary minus not make the
language easier to understand? What's wrong with typing (negate x) in the
rare cases where you can't just re-write the expression to use infix minus
instead (ie x + -y ===> x - y)? Surely most programs in Haskell are not just
arithmetic expressions, and while it is convenient to have infix +, -, *,
`div`, `mod` for the integers, so you can do indexing over data types and
other "counting" operations, I'd argue that the usual functional notation
(eg (exp x) (factorial x) (negate x)) should be sufficient for the other
arithmetic operations just as it's deemed sufficient for nearly everything

> In mathematics, we don't use separate symbols for negative integers,
> and negated positive integers, even though in the underlying
> representation of the integers as equivalence classes of pairs of
> naturals, we can write things like -[(1,0)] = [(0,1)], which expressed
> in ordinary notation just says that -1 = -1. This doesn't bother us,
> because the two things are always equal.
>
> Another thing to note is that all the natural literals are not, as one
> might initially think, plain values, but actually represent the
> embedding of that natural number into the ring (instance of Num), by
> way of 0 and 1. They simply provide a convenient notation for getting
> particular values of many rings, but in many cases, don't get one very
> far at all before other functions must be introduced to construct the
> constant values one wants. While there always is a homomorphism from Z
> to a ring (represented in Haskell by fromInteger), one would get
> similar expressiveness by with just the nullary operators 0 and 1, and
> the unary negation as well as addition and multiplication (albeit with
> an often severe performance hit, and some annoyance, I'm not
> recommending we really do this, simply characterising the purpose of
> numeric literals).
>
> If the performance issue regarding the polymorphic literal -5 meaning
> negate (fromInteger 5) is a problem, it would be easy enough to agree
> for the compiler to find and rewrite literals like that as fromInteger
> (-5) instead, where -5 is the precomputed integer -5. Assuming that
> fromInteger is not broken, that will always mean the same thing
> (because fromInteger is supposed to be a homomorphism). Similarly,
> when the type of (fromInteger x) is known statically to be Integer,
> the compiler can rewrite it as x. In any event, this is a tiny
> constant factor performance hit.
>
> Anyway, the point of all this is that 0,1,2... are not really literals
> at all. They're nullary operators which give particular elements of
> any given instance of Num. Perhaps at some level in the compiler after
> performing the fromInteger transformation they may be taken as literal
> integers, but there is no reason that this level has to be exposed to
> the user.

This seems very theoretical to me. In the context of programming, I don't
see the problem of just thinking of the integers as a primitive built-in
data type which contains some range of positive and negative integers which
I'd argue should all be treated on an equal footing when the context of
discourse is the integers not the naturals.

Another point is that the current treatment requires a special rule for
pattern matching against a negative integer or float, which would not be
needed if negative literals could be specified directly.

>
> Additionally, consider things like Rational. It is possible to write
> some elements of Rational in terms of integer "literals", but not all
> of them, even if negative literals become included. Floating point
> literals help a bit here, but not really all that much. (Consider
> things like 1/3, or 1/7.) In particular, any rational number with a
> denominator greater than 1 is inaccessible from that interface. Based
> on your previously mentioned design principle that all values of a
> type should be expressible via literals, or none of them should be, we
> should in fact remove the polymorphic interface for 0,1, etc. and
> force the user to type 1%1 for the rational 1. But this is annoying,
> and destroys polymorphism!
>
> I think that design principle is broken. If it was extended to say
> something like "All values of a type should be possible to write
> solely in terms of its constructors, or none of them should be.", then
> potentially infinite data structures would be excluded from having
> exposed constructors, for no good reason other than that there are
> infinite values which require other operations to define. This is, in
> a way, rather similar to the problem with rationals.

Yes I see now that that design principle appears too restrictive in general.

>
> I'd also like to say that the exponentiation example is also a good
> one. -4^2 is *always* -16, in every sane mathematical context since
> unary negation is treated as an additive operation, and thus should
> happen after exponentiation and multiplication (though under normal
> circumstances, it doesn't matter whether it's done before or after
> multiplication).

In C, it wouldn't be, since there, unary ops always bind tighter than infix
ops, and the precedences used in C are also used in C++, Java, C#,
Javascript etc, and even ISO Prolog obeys the rule that unary minus binds
tighter so making unary minus have the same precedence as infix minus just
makes Haskell syntax difficult to parse for anyone coming from one of these
other very popular languages. Imho, for better or worse, C has established a
kind of de-facto standard that unary ops always bind tighter than infix ops
in programming languages ;-)

Also, it's a good example of why we should *not* have unary minus, since the
above could be written with no ambiguity as:

negate (4 ^ 2)

or better still:

negate (expNat 4 2)

because this would free the ^ symbol for some more widely applicable use,
and would also make the particular choice of exponentiation operator more
explicit (ie ^ or ^^ - the symbols don't give much clue what the differences
between them are, only that they are both something to do with
exponentiation, whereas actual words like expNat expInt would make explicit
both the similarity and the difference between them).

>
> Though this is a little offtopic, another important thing to note
> about parsing exponentiation is that a^b^c always means a^(b^c) and
> not (a^b)^c, which is a fairly standard thing in mathematics, because
> of the tendency to automatically rewrite (a^b)^c as a^(b*c), which
> looks nicer (and wouldn't normally involve parentheses on the page),
> and that no such rule exists for the other association.
>
> While I've considered that there are reasons that requiring spaces to
> be included to separate operator symbols from their arguments might
> actually be a decent thing to have, I wouldn't recommend doing things
> in the way that you're suggesting. With that in place, we could have
> negative integer literals (provided that people really care that
> strongly), but that's no reason to drop unary negation altogether --
> just require that a space occur between the unary minus and its
> parameter. However, there are certain operators, especially
> exponentiation, and multiplication inside an additive expression,
> which putting spaces around them just looks "wrong" to me, and though
> I might be able to get used to it, I'd probably end up recompiling
> things all the time over syntax errors related to it. Newcomers to the
> language would also probably dislike it when they typed x+y at the
> ghci prompt and got some error saying that x+y is not in scope.

I don't think there is a need to force spaces to be put around every infix
application. It's only when there would be a conflict with the lexical
syntax that spaces are needed, just as at the moment we have (F . G) versus
(F.G), (f  \$  g) versus (f  \$g) etc. As long as one's preferred editor
highlights literals differently from symbols, I think it would be difficult
to not notice the distinction between "x - 2" and "x -2" if unary minus were
replaced by negative literals.

Regards, Brian.
--
Logic empowers us and Love gives us purpose.
Yet still phantoms restless for eras long past,
congealed in the present in unthought forms,
strive mightily unseen to destroy us.

http://www.metamilk.com

```