[Haskell-cafe] Very freaky

Cale Gibbard cgibbard at gmail.com
Fri Jul 13 03:17:03 EDT 2007


On 12/07/07, Albert Y. C. Lai <trebla at vex.net> wrote:
> Cale Gibbard wrote:
> > As someone with a background in mathematics, I'd say that the idea of
> > mathematical symbols is to make things more concise, and easier to
> > manipulate mechanically. I'm not entirely certain that their intent is
> > to make things clearer, though often they can make things more precise
> > (which is a bit of a double edged sword when it comes to clarity). I
> > quite often try to avoid symbols as much as possible, only switching
> > to formulas when the argument I'm making is very mechanical or
> > computational. After all, in most circumstances, the reader is going
> > to have to translate the symbols back into concepts and images in
> > their head, and usually natural language is a little farther along in
> > that process, making things easier to read.
>
> I always have some beef with the common perception expressed in some of
> the above.

This is quite a long reply!

> I suppose I can speak concretely by listing again the three different
> presentations appearing above and comparing them.
>
> (A) ∀ pred : Nat → Prop .
>          pred 0
>        → (∀ n:Nat . pred n → pred (n + 1))
>        → ∀ n : Nat . pred n
>
> (B) For all statements about natural numbers,
>        if    the statement applies to 0,
>        and   if it applies to a number it applies to the next number,
>        then  it applies to all numbers.
>
> (C) Mathematical induction

This case does call for some names of things. Here's one way I might write it:

Theorem (Mathematical Induction for Naturals):
Let P be a predicate on natural numbers. If P(0) is true, and whenever
P(k) is true, we have that P(k+1) is true as well, then P(n) holds for
every natural number n.

I'm fairly sure that I prefer that to both A and B, at least for the
purposes of teaching.

>
> I have re-formatted (A) and (B) to be fair, i.e., they both receive nice
> formatting, and both formatting are more or less equivalent, dispelling
> such myths as "formal logic is unorganized" or "English is unorganized".
> I have formatted them to be equally well-organized.
>
> (C) is the clearest to those people who have already learned it and
> haven't lost it. I grant you that if you talk to a carefully selected
> audience, you just say (C) and leave it at that. But if you need to
> explain (C), saying (C) again just won't do.
>
> Most people claim (B) is clearer than (A) when it comes to explaining
> (C). One unfair factor at work is that most people have spent 10 years
> learning English; if they have also spent 10 years learning symbols, we
> would have a fairer comparison. Someone who know C but not Haskell is
> bound to say that C is clearer than Haskell; we who know both languages
> know better.

However, if they've spent 10 years on English, and, say, 6 more on
learning a formal system, it seems reasonable to take advantage of
both, wherever one outstrips the other.

> (B) is less clear than (A) on several counts, and they are all caused by
> well-known unclarity problems in most natural languages such as English.
>
> "if it applies to a number it applies to the next number" is unclear
> about whether "a number" is quantified by ∀ or ∃. It could be fixed but
> I think I know why the author didn't and most people wouldn't. If we
> actually try to insert "every" or "all" somewhere there, the whole of
> (B) sounds strange. Part of the strangeness is due to the lack of scope
> delimiters: Later in (B) there is another "all numbers" coming up, and
> two "all numbers" too close to each other is confusing. You need scope
> delimiters to separate them.

Just name your variables differently.

> English provides just a handful of pronouns: I, we, you, he, she, it,
> they, this, that, these, those. The amount is further decimated when you
> make mathematical statements (pure or applied). Any statement that needs
> to refer to multiple subjects must run into juggle trouble. In (B),
> fortunately 95% of the time is spent around one single predicate, so you
> can just get by with "it". You can't pull the same trick if you are to
> explain something else that brings up two predicates and three numbers.
> To this end, an unlimited supply of variables is a great invention from
> formal logic. (I am already using this invention by designating certain
> objects as (A), (B), (C) and thereby referring to them as such. There is
> no more tractible way.) Of course with the use of variables comes again
> the issue of scope delimiters. Actually even pronouns can be helped by
> scope delimiters.

You can use variables to name things, and you can also use quantifiers
without actually using the formal symbols for them.

> English badly lacks and needs scope delimiters. Quantifiers need them,
> variables need them, and here is one more: nested conditionals such as (B):
>      if if X then Y then Z
>      X implies Y implies Z
> are unparsible. Perhaps you can insert a few commas (as in (B)) or
> alternate between the "if then" form and the "implies" form to help just
> this case:
>    if X implies Y then Z
> but it doesn't scale. It also proves to be fragile as we insert actual
> X, Y, Z:
>
>      if it applies to a number implies it applies to the next number
> then it applies to all numbers
>
> Now, (B) is actually parsible, but that's only because I formatted it
> thoughtfully. Formatting is a form of scope delimiting, and that solves
> the problem. But formatting is tricky, as we have all learned from
> Haskell indentation rules. It is best to combine explicit delimiters
> such as parentheses with aesthetic formatting. Both the textually
> inclined and the visually inclined can parse it with ease, and (A) shows
> it. Actually, I formatted (A) first, since that's more obvious to
> format; then I copied its formatting to (B).

Of course, there are nesting limitations to English, but when they get
heavy, you will also find that people will automatically start finding
new names for things, in order to avoid having to awkwardly insert
pauses into their speech to indicate where the brackets should go. To
some extent, this is where new abstractions come from, the limitations
of our brains in holding onto arbitrarily nested constructions.

It's also possible to make (numbered) lists of conditions or results,
which makes it more apparent where each component begins and ends.
There are phrases like "The following are equivalent". And if all that
fails, you can of course still fall back on formal symbols, but you're
probably better off making a new definition or two.

> You may counter me with this part:
>
> (A') ∀ n : Nat . pred n
> (B') pred applies to all numbers

Not really, I actually like that variable to be there sometimes too.

> Surely the introduction of the variable n in (A') is an extra burden,
> and (B') streamlines it away? Yes, but (A') is not the best expression
> possible in formal logic anyway. Following the formal logic of HOL
> (hol.sf.net) and many others, ∀ is just another polymorphic operator:
>      ∀ :: (a -> Prop) -> Prop
> and furthermore (A') is just syntactic sugar for
>      ∀ (λn : Nat . pred n)
> which is
>
> (A*) ∀ pred
>
> Now it is streamlined: It only mentions the two essences of (B'): about
> pred, and being universally true. No unnecessary variable. If you also
> want it to say something about Nat, just throw in some types.

I must admit that I'm far more used to first order logic with some set
theoretic axioms stacked on top of it.

> Cale may counter that I have only been addressing precision, not
> "clarity" as he defined by ease of reading and inducing images in heads.
> Besides replying him with "if it doesn't have to be correct, I can write
> one that's even easier to read" (with apology to Gerald Weinberg), let
> me play along that definition just a bit. What image should I form in my
> head, if "a number" may be universal or existential and I don't know
> which to image, if "it" juggles between three subjects and I don't know
> which to image at which instance, and if the whole thing lacks scoping
> so much that it is not even parsible?

The point is that if that's really imprecise in the sense that the
intended listener might mistake it to mean something else, then the
text should certainly be improved, but it's easy to use quantifiers in
mathematical English too. We can say "for all numbers n" or "there
exists a number n", or some variation meaning one of these. We can
also say things like "Let n be a number", or "Suppose there were an n
such that P(n)."

> As I write in another post:
>
> "I think of formal logic as clarifying thought and semantics, cleaning
> up the mess caused by idiosyncracies in natural languages (both syntax
> and semantics) such as English. But not many people realize they are in
> a mess needing cleanup."
>
> There are two ways to clean up anything. Take governments for example.
> If a government is corrupted, you could help it and fix it up, or you
> could start a revolution and remove it.
>
> Natural languages are also corrupted when it comes to clarity of
> thought, reasoning, and semantics. I take the extremist approach of
> siding with the revolutionaries, and that is starting over with a brand
> new language divorcing natural words, i.e., formal logic. I do not
> invest hope in fixing up natural languages because of these sentiments:
> natural languages are too fragile to fix, natural words are too
> overloaded to be safe to reuse, and if you fix it up it will look just
> like formal logic with bloat. It is also a benefit, not a drawback, to
> learn a different language and broaden one's mind; it is a drawback, not
> a benefit, to avoid a different language and narrow one's mind. It is
> also a benefit, not a drawback, to keep natural languages around for its
> strength: poems, puns, visions, sweeping generalizations; but do let's
> use a different language for a different task: a logical one for a
> logical task.

Yeah, so let's suppose you don't have a blackboard handy, let's say
you're talking to your friend on the phone. How do you convey
mathematical statements? I certainly hope that you don't describe each
character in turn! You'll probably render the logic into some spoken
form, which is quite likely to be just the sort of thing which I'd
tend to write down in common use.

Don't get me wrong, when it's appropriate, I'll actually use explicit
quantifier symbols (∀,∃), and symbolic forms of things like "implies",
"and", "or", and so on, but usually this is when either I'm simply
unravelling some definitions, or I feel that a particularly
symbolic/computational approach to logic seems to be the best level at
which to consider the given statements. Usually though, I like to keep
the reader's mind focused on the subject matter at hand, and not the
formal logic framework that we happen to be using. When you're
specifically doing mathematical logic, of course it's more appropriate
to use the symbolic form, since the whole point is in the manipulation
and discussion of symbolic sentences. When you're doing geometry or
combinatorics, it's generally inappropriate to get involved in
minutiae like that too often.

> Textually-inclined people feel at home with formal logic.
> Visually-inclined people are disadvantaged, but they are just as
> disadvantaged with textual natural languages. What they need are visual
> languages. Formal logic enjoys a more well-defined grammar and is a
> better basis than natural languages for developing visual languages.

I'm not entirely sure this last claim is true. For me at least, the
visual language which I use to reason about mathematical objects is
almost entirely divorced from the original English or formal
statements, except that each of the components readily has an
interpretation in terms of them.

Let's take a simple example from the context of topology. Consider the
perhaps still somewhat vague statement: Let S be a surface obtained by
removing an open disc from each of the torus and the projective plane,
and gluing a cylinder between the two circles at the boundaries of the
removed discs.

This is a translation into English of some shapes in my head.

Note that this involves no explicit maps of any kind. I don't even
really mention that there's a quotient, except for using the word
'gluing'. Let's assume we don't yet have some operation which does
this all formally. You can probably still see quite clearly what I
want, geometrically. You can also probably formalise it for yourself,
perhaps in even more generality than this specific case. I would say
that, among the right audience, it does a fairly adequate job of
defining a surface up to homeomorphism, putting it in the heads of the
people in the room. I can go on to talk about paths in that space
fairly adequately, and even perhaps draw a picture of it without
people shouting that it's not at all the picture that they had (at
least insofar as I can draw a recognisable picture of the projective
plane!). If I *do* happen to have the operation which does this
formally, then by saying what I've said above, I convey that I want
that operation to those people who are aware of it, and I convey
something which is still reasonably meaningful to those who don't --
they might even be able to determine what the formal definition must
be just from that.

On the other hand, if I'm reasonably sure that everyone at hand is
aware of the formalism, I might say: "Let S be the surface obtained by
taking the connected sum of the projective plane and the torus."

Or even the symbolic: "Let S = P # T", requiring a little more
context, of course.

These are both far more concise and far more opaque to those
unfamiliar with the context.

To define the connected sum more formally would take just a little
more time and care, which may or may not be worthwhile depending on
how detailed a discussion one wants to get into. Obviously, there are
many circumstances which warrant that, and any extended period of work
would. As an introduction to the whole idea of the connected sum,
studying an example similar to this might be useful before moving to
the general idea.

Even when doing so, writing the definition of the connected sum in
terms of formal logic will generally not make it clearer. In any
proper treatment though, the quotient and the isomorphism between the
two circles would be made explicit. I might remove the cylinder from
the formalisation, since it is mostly there as a visual aid rather
than something which is mathematically necessary. I might even manage
to bore people by spelling the details out though. 'Oh, come on, we
know what you mean already! It was enough to say "glue the circles
together!"'

My point is not that the formalism shouldn't be there, somewhere. It's
just that writing it down is often an exercise in tedium, for any but
the most primitive of tasks. If you don't believe it, have a look at
metamath.org. While that level of rigour is a worthy goal as a
separate project, it's not something you'll ever convince any
mathematician to do on an everyday basis in their own work, unless the
person in question is a hardcore logician. Even then, it's likely
going to be an uphill battle. The reason is that this sort of
translation into the mechanical is usually extremely boring! It can be
complicated certainly, but usually long before you've finished,
perhaps even before you've begun, you've convinced yourself that it's
doable. The rest is busy work, and typically doesn't serve to
elucidate anything apart from possibly some details about the
formalism in question rather than the theorem you're embedding into
it.

My algebraic topology professor had a word for the sort of language
which is typically used, and what most mathematicians are aiming for:
rigourisable. Specifically, the goal is to pin enough details down so
that anyone in the audience could sit down and, with knowledge of a
particular formalism, turn it into something that a computer would
accept. Humans are at present much better than machines at bridging
all the small gaps in a consistent fashion (like applying the
appropriate isomorphisms and so on), and there's little reason that we
should suffer to communicate with each other at their level.

On the other hand, for many modern papers, putting some extra details
somewhere seems like something worthwhile, and you might even be able
to get graduate students to do a bit of it. Writing everything into a
formal, machine-checkable system still seems pretty unlikely though.

There's also a good philosophical reason for working in a way which is
slightly detached from any particular formalism when possible -- any
particular formalism for mathematics might happen to be inconsistent.
The normal level of detachment from formalism would cushion many
results from the shock-waves an inconsistency would send through every
branch of mathematics, as it is quite likely that many things will
still have justifiable translations in terms of whatever new system
people would then come up with. Things written directly at the level
of the formalism would be comparatively brittle, and possibly need
translation up into softer language and then back down into the new
system.

Anyway, that's just my own perspective on the matter. Sorry for going
on for so long in this somewhat off-topic discussion. :)


More information about the Haskell-Cafe mailing list