[Haskell-cafe] abs minBound < (0 :: Int) && negate minBound == (minBound :: Int)

Richard A. O'Keefe ok at cs.otago.ac.nz
Wed Aug 21 07:36:28 CEST 2013


On 20/08/2013, at 6:44 PM, Kyle Miller wrote:
> By "working as expected" I actually just meant that they distribute (as in a(b+c)=ab+ac) and commute (ab=ba and a+b=b+a),

That is a tiny fraction of "working as expected".
The whole "modular arithmetic" argument would come close to
having some virtue, *except* that division just plain does not
fit.  In particular, it's painfully easy to find x and y such that
 (x*y) `div` y is not congruent to x modulo 2**n.
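
For instance (a sketch of my own, assuming GHC's 64-bit Int):

    -- Multiplication wraps modulo 2^64, but division does not
    -- "unwrap" it: the two printed values differ by 2^63, which
    -- is not a multiple of 2^64, so they are not even congruent.
    main :: IO ()
    main = do
        let x = 2 ^ 62 :: Int
            y = 2      :: Int
        print ((x * y) `div` y)   -- -4611686018427387904  (-2^62)
        print x                   --  4611686018427387904  ( 2^62)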

The existence of division makes the "everything's OK because
it's modular" argument look very sick indeed.

>  The interpretation of any of these numbers being positive or negative is specious, but people still do it because it works reasonably well for small integers.

No, they do it because their programming language specifications
encourage them to do it, because their introductory courses teach
them to do it, and their compilers, on seeing x < 0, do not scream
at them "that is not well defined!", so the idea that it is
supposed to work is constantly reinforced.  And above all, "people
still do it because" in most programming languages they have no
practical alternative.

>  Also, there's the definite advantage that you can use the same instructions for adding/multiplying signed and unsigned integers (for instance, pointer arithmetic).
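
To be fair, the property being appealed to is real enough: under
twos-complement a single bit-level adder serves both readings.  In
Haskell terms (my sketch, not Kyle's):

    import Data.Int  (Int8)
    import Data.Word (Word8)

    -- Reinterpreting the inputs and reinterpreting the sum agree,
    -- because both additions are the same operation modulo 2^8.
    sameAdder :: Word8 -> Word8 -> Bool
    sameAdder a b =
        fromIntegral (a + b) == (fromIntegral a + fromIntegral b :: Int8)

    -- all [sameAdder a b | a <- [minBound ..], b <- [minBound ..]]
    --   == True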

As the user of a programming language, what conceivable advantage
is that to me?  All the machines I have access to these days have
two sets of multiply and divide instructions, and there have been
machines with two sets of add and subtract instructions, and it is
no big deal.

The B6700 had one set of instructions (which actually dealt with both
integers and floating point).  It didn't have _any_ instructions for
unsigned integer arithmetic, and Fortran, COBOL, BASIC, Pascal, Algol,
and PL/I programmers didn't miss them.  (In particular, to a B6700
programmer familiar with his/her machine's instruction set, the idea
that variable access might have anything to do with unsigned
operations would have seemed, heck, did seem quite bizarre.)

For that matter, IBM mainframes have had, since the 1960s,
	A	signed Add
	AL	unsigned Add Logical
	S	signed Subtract
	SL	unsigned Subtract Logical
	M	signed Multiply			\ 
	ML	unsigned Multiply Logical	| these four
	D	signed Divide			| are common
	DL	unsigned Divide Logical		/
and it never stopped them being fast.  I doubt that the presence of
these instructions had any significant effect on the complexity of
the machines.  Even some 1980s single-chip machines did this.

> You mention that the B6700 trapped on overflows.  While this is a nice feature, this has nothing to do with the number format.

That's half true.  The B6700 number format was such that there was nothing
sensible they _could_ do on integer overflow.  (Look it up or trust me;
truncation would have been violently unnatural on those lovely machines.)

> One example of a nice thing about doing computations modulo 2^n is that you can do a bit twiddling trick called reciprocal multiplication (maybe sometimes called magic number multiplication).  One reference for this is at [1].  Another reference is Hacker's Delight.  But maybe you can save this for your ear's fingers.

I have a copy of Hacker's Delight within arm's reach.
This is not really something that people writing applications want.
Usually, what they need is affordable multiplication and division
that give right answers when right answers are to be had and don't
drive the program insane with rubbish when right answers are not to
be had.
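
For concreteness, the flavour of the trick that leans hardest on
modulo-2^n arithmetic is exact division by an odd constant via its
modular inverse.  A sketch (the constant is mine, computed for 64-bit
words, not taken from Kyle's references):

    import Data.Word (Word64)

    -- 0xAAAAAAAAAAAAAAAB * 3 == 1 (mod 2^64), so multiplying by it
    -- undoes an earlier multiplication by 3: exact division with no
    -- divide instruction, for inputs known to be multiples of 3.
    divBy3 :: Word64 -> Word64
    divBy3 x = x * 0xAAAAAAAAAAAAAAAB

    -- divBy3 (3 * 12345) == 12345

Clever, but it is the compiler's business, not the application
programmer's.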

There are plenty of clever things computer architects can do, up to
and including keeping a "last divisor" cache in the division unit
to accelerate divisions that reuse a recent divisor.

> I can't really say I understand why anyone would actually want to use Int (unless they knew they wanted a modulo 2^n Int).

Because they are calling an existing function that requires it.

It's just like the question "the only integral type in standard C that
*cannot* be used safely to hold an 8-bit character is char, so why
would anyone want to use char*?"  Answer: "because of all the library
functions that demand it."

> but Integer is actually (if you're using GMP with your ghc):

Yes, that's tolerably well known.  You only pay the space overhead
when you need it (like Lisp or Smalltalk).  But you always pay the
time overhead.
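
For readers following along, the definition being quoted is, if memory
serves, the integer-gmp one of that era, roughly:

    {-# LANGUAGE MagicHash #-}
    import GHC.Prim (Int#, ByteArray#)

    -- integer-gmp's two-armed representation, circa GHC 7.6
    -- (from memory, so treat the details as approximate):
    data Integer
        = S# Int#             -- small: fits in one machine word
        | J# Int# ByteArray#  -- large: sign/size word plus GMP limbs

The space cost of the J# arm is only paid for big values; the time
cost is the tag test that every operation must do before anything else.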

> Where are you getting that about C?  Do you mean that it's careful to allow implementations to decide to trap overflows?  Because as far as I can tell, the C standard just says signed overflows give undefined behavior.

The thing is that the standardisers *could* have defined signed int arithmetic
to wrap (just like the benighted Java designers did) but they *chose* to leave
the effect undefined (just like the Pascal standard) *so that* implementations
that trap on overflow (which did exist) would be allowed (again, just like the
Pascal standard).  I used to spend way too much time reading comp.std.c.
Generally, when the C standard leaves something undefined, that's because there
were already systems that did things too differently to capture in one tidy
definition.
> 
> * It's unclear to me how important simple adder and multiplier hardware is these days. But, at least two's complement doesn't require defining case-by-case behavior, as I'd expect sign-and-magnitude operations to be defined.

No.  Multiplication and division get easier, comparison at worst requires
one extra exclusive-or, and addition and subtraction require some cunning
but are doable.
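
To spell out the comparison point: a sign-and-magnitude word can be
mapped to an unsigned key with the same ordering using nothing but
bitwise operations -- the same trick people use to compare IEEE-754
floats with integer instructions.  An illustrative sketch of my own,
for 8-bit words with the sign in bit 7:

    import Data.Bits (complement, testBit, xor)
    import Data.Word (Word8)

    -- Order-preserving map from a sign-and-magnitude bit pattern
    -- to a Word8 that can then be compared as plain unsigned.
    smKey :: Word8 -> Word8
    smKey x
        | testBit x 7 = complement x    -- negative: flip all the bits
        | otherwise   = x `xor` 0x80    -- non-negative: set the top bit

    -- smKey 0x83 < smKey 0x02   -- i.e. -3 < +2, as it should be
    -- (this puts -0 just below +0; real hardware would have to decide
    -- whether to special-case that)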

For what it's worth, the descendant of Burroughs sells a descendant of the
B6700 that is actually emulated on twos-complement hardware, and still runs
fast enough (given its typically I/O bound applications) to survive in
today's market.  (Why twos-complement hardware?  Because it's mass produced
cheap commodity hardware.)




