[Haskell-cafe] Int is broken [Was: Different answers on different machines]

Rustom Mody rustompmody at gmail.com
Tue Jun 4 06:22:23 CEST 2013

On Tue, Jun 4, 2013 at 7:35 AM, Richard A. O'Keefe <ok at cs.otago.ac.nz> wrote:

> On 3/06/2013, at 6:58 PM, Carter Schonwald wrote:
> > If the Int type had either of these semantics by default, many many
> performance sensitive libraries would suddenly have substantially less
> compelling performance.  Every single operation that was branchless before
> would have a branch *every* operation..... this would be BAD.
> Actually, the x86 can be configured to trap integer overflows,
> so on that not entirely unpopular platform, there need be NO
> extra branches.

Well, yes and no. See http://software.intel.com/en-us/forums/topic/306156
Using instructions like CMOVO ("Conditional Move on Overflow") we can test
without a branch -- so in that sense, yes.

No, because the use of the word 'trap' is a bit misleading. If we
understand 'trap' to mean a synchronous interrupt, then Intel provides the
infrastructure to literally trap floating-point errors, but for integers
such a 'trap' only works if the instruction stream explicitly contains
instructions like INTO or CMOVO.
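For illustration (my own sketch, not something from the thread): the
condition that the x86 OF flag records -- and that CMOVO/INTO test -- can
also be computed branchlessly in software, using the sign-bit rule for
two's-complement addition. This assumes GHC's wrapping Int semantics:

```haskell
import Data.Bits (xor, (.&.))

-- Signed addition overflows exactly when both operands have the same
-- sign and the wrapped sum has the opposite sign.  That is the
-- condition the hardware OF flag encodes, computed here with two XORs,
-- an AND, and a sign test -- no branch on the inputs.
addOverflows :: Int -> Int -> Bool
addOverflows x y =
  let s = x + y                         -- wraps on overflow, like the machine ADD
  in ((x `xor` s) .&. (y `xor` s)) < 0  -- sign bit set iff overflow occurred
```

So a library could detect overflow cheaply; the point of contention is who
pays for the check and whether it happens by default.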

> Alternatively, and more portably, there could be separate
> Int and UnsafeInt types, and the performance sensitive libraries
> could be rewritten to use UnsafeInt.
> For just one week, I had the joy of using a C compiler where
> signed integer overflow was trapped.  It was *wonderful*.
In A Discipline of Programming (in 1976!) Dijkstra described exactly this
problem, and squarely put the blame on poorly engineered machines.
He introduced three concepts/terms:
UM : unbounded machine
SLM : sufficiently large machine
HSLM : hopefully sufficiently large machine

The UM -- like a Turing machine -- has no messy restrictions of finiteness
like wordsize and is therefore pleasant to reason with and impossible to
physically build.

The SLM is like most of our actual machines -- finite state machines
approximating our conceptually nice unbounded machines. The problem is
that when the approximation fails, the SLM behaves unpredictably.

So we have the HSLM, which (I quote):

The HSLM is two things merged into one. Besides acting as the largest SLM
we can afford, it checks, when called to execute a program, as the
computation proceeds, whether this SLM is large enough for the current
computation.  If so, it proceeds with the simulation of the UM's behaviour,
otherwise it refuses to continue.

There exist, regretfully enough, machines in which the continuous check
that the simulation of the behaviour of the UM is not beyond their
capacity is so time-consuming, that the check is suppressed for the sake
of efficiency.

It is very difficult to use such machines... and we ignore them in the
sequel.

In short, the problem is our machines: if catching errors involves
checking, and checking has a cost, some programs (and programmers) will
sometimes seek to avoid it.
Moving the check into the hardware -- i.e. a synchronous trap on error --
removes the cost, and with it the temptation to skip the check.
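To make the HSLM idea concrete in Haskell terms (a hypothetical sketch;
SafeInt is my own name, not a proposal from the thread): a wrapper whose
arithmetic checks each result against the machine's bounds and refuses to
continue, instead of silently wrapping, by computing in the "UM" (Integer)
and then checking whether the "SLM" (Int) is large enough:

```haskell
-- A hypothetical checked-Int wrapper, in the spirit of Dijkstra's HSLM:
-- do the arithmetic on unbounded Integers, then refuse to continue if
-- the result does not fit the bounded machine word.
newtype SafeInt = SafeInt Int deriving (Eq, Ord, Show)

checked :: Integer -> SafeInt
checked n
  | n < toInteger (minBound :: Int) || n > toInteger (maxBound :: Int)
      = error ("SafeInt overflow: " ++ show n)  -- the HSLM "refuses"
  | otherwise = SafeInt (fromInteger n)

instance Num SafeInt where
  SafeInt a + SafeInt b = checked (toInteger a + toInteger b)
  SafeInt a - SafeInt b = checked (toInteger a - toInteger b)
  SafeInt a * SafeInt b = checked (toInteger a * toInteger b)
  abs (SafeInt a)       = checked (abs (toInteger a))
  signum (SafeInt a)    = SafeInt (signum a)
  fromInteger           = checked
```

Going through Integer is of course a deliberately naive way to pay the
checking cost in software -- exactly the cost a hardware trap would make
disappear.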

Until we get such machines, these arguments will keep coming up!