The dreaded M-R

John Hughes rjmh at cs.chalmers.se
Thu Jan 26 04:59:22 EST 2006


I had promised myself not to propose anything that was not already 
implemented,
tried, and tested, but I just can't resist bringing up a proposal I've 
made in the
past to fix the monomorphism restriction. Maybe now is the time to do so.

I know many will proclaim "just get rid of it", but consider this: 
without the
M-R, some programs can run exponentially slower than you expect. This
actually happened to me, which is how we discovered something of the sort
was needed in the first place. But the current design is surely a wart.

The fact is, Haskell has two different binding mechanisms--bind-by-name
(used for overloaded definitions), and bind-by-need (monomorphic). Both
are useful: bind-by-name lets us name overloaded expressions, while
bind-by-need gives performance guarantees. The trouble is just the way
we distinguish them--where the compiler is basically guessing from the
form of a definition which one to use. Two problems that leads to:

* You can't eta-convert definitions freely, if there is no type signature.
  We've all hit this one, where you write something like sum=foldr(+)0
  and you can't export it, because it's monomorphic.

* Definitions without a type-signature can change from polymorphic
  to monomorphic as a result of changes elsewhere in the program.
  Because the M-R applies only to overloaded definitions, then introducing,
  for example, an equality test in a function the definition calls can
  change its type, and make the M-R suddenly apply where it did not before.
  That can lead to unexpected errors.

The solution I favour is simply to use *different syntax* for the two
forms of binding, so that a definition is monomorphic, and computed
at most once, if it uses the monomorphic binding operator, and
polymorphic/overloaded, computed at each use, if it uses the other.
Whether it's a function definition or not is irrelevant, as is whether
or not it carries a type signature.

The trick is finding good syntax. I suggest = for bind-by-name, and
:= for bind-by-need. (Some object that := "means" assignment--but come
on, we're not reserving := for future use as assignment in Haskell, are we?
Why should we give up a perfectly good symbol because it's used elsewhere
to mean something else?). With this notation, = would be appropriate
for function definitions, and := for most non-function definitions. It
would be instantly clear where there was a possibility of repeated
evaluation, and where not.

The problem with making such a syntactic distinction is that, however
it's done, many changes must be made to existing programs. Just because
existing programs contain many bindings of each sort, there's no
getting away from the fact that a syntactic distinction will force
changes. In principle this could be automated, of course--not hard
but somebody would have to do it. But perhaps it would be worth it,
to eliminate probably the number one wart, and solve the problems
above.

I put it on the table, anyway.

John



More information about the Haskell-prime mailing list