[Haskell-cafe] Serialization of (a -> b) and IO a
dan.doel at gmail.com
Thu Nov 11 09:36:38 EST 2010
On Thursday 11 November 2010 6:22:06 am Sjoerd Visscher wrote:
> What I'm wondering is if it would actually break things if serialize would
> not produce the same String for these functions.
Yes. It would break the (usual) mathematical underpinnings of Haskell.
> The reasoning above is used regularly to shoot down some really useful
> functionality. So what would go wrong if we chose to take the practical
> path, and leave aside the theoretical issues?
You would lose many uses of equational reasoning in your programs. Have you
every substituted 'x * 2' for the expression 'x + x' in one of your programs,
or vice versa? You can no longer do that, because someone may be serializing
the function you're writing, checking how it's implemented, and relying it.
You'd lose the whole notion of 'the category of haskell types and functions'
goodbye, too. Does f . id = f? Not if the former serializes as "f . id". We
already have a weak case of this, since (\x -> undefined x) can be
distinguished from undefined using seq, but that can be hand-waved away by not
worrying about bottoms so much. That isn't going to work for serialize. There
does exist practical Haskell stuff from category theory inspired people.
As a digression... When you get into things like dependent type theory,
equality is actually incorporated into the language in a much more direct way.
And there are various sorts of equality you can add to your type theory. Two
common choices are:
intensional equality: two values are provably equal if they evaluate to the
same normal form
extensional equality: this incorporates non-computational rules, like the
point-wise equality of functions.
Now, in a type theory where equality is intensional, I cannot prove:
(forall x. f x = g x) -> f = g
However, both these equalities (and others in between and on either side) are
*compatible*, in that I cannot disprove the above theorem in an intensional
What seq and serialize do is break from extensional equality, and allow us to
disprove the above (perhaps not for seq within a hypothetical theory, since
the invalidating case involves non-termination, but certainly for serialize).
And that's a big deal, because extensional equality is handy for the above
reasoning about programs.
> Also, the above two functions might not be exactly denotationally equal if
> the type is (Float -> Float), or the speed or memory use might be
> different. Could it not be that requiring them to be equal could just as
> well break things?
I would not be at all surprised if they are unequal for Float -> Float,
considering how ugly floating point types are. That doesn't justify breaking
equality for every other type.
Speed and memory use are typically disregarded for equality of (expressions
denoting) functions. Merge sort and bubble sort are two algorithms for
computing the same function, even though they vary wildly in these two
characteristics. If you want a calculus for precisely reasoning about resource
usage, then you would not necessarily be able to consider these equal.
But, Haskell is simply not such a calculus, and I don't think, "what if it
were," is a good argument for actively breaking extensional equality in such a
dramatic way (since the breakage would have nothing to do with precise
reasoning about resource usage).
Gábor Lehel wrote:
> It is true that (a) per Luke Palmer, if we could serialize equal
> functions to equal representations then we could we could decide
> whether two pure functions were equal, which (if not done in the IO
> monad) would(?) break purity; and (b) per Dan Doel, if we wanted to
> implement our serialization in a way so that equal functions get equal
> representations, we couldn't do it, because it's an impossible
> But these sort of cancel each other out, because (a) it's an
> impossible problem, and (b) we don't want to do it.
They do not cancel each other out. Rather, it goes like this:
a) Serializing functions gives you a decision procedure for function equality.
b) It is impossible to decide extensional equality of functions.
Together, these mean that any serialization procedure you define is *wrong*,
in that it violates the semantics of pure Haskell expressions (the resulting
decision procedure will answer 'False' for some pair of extensionally equal
expressions f and g).
The way IO can get around this is that when you have:
serialize (\x -> x + x :: Integer) :: IO String
serialize (\x -> 2 * x :: Integer) :: IO String
you can say that the two IO actions are equal in the following sense:
1) There is an equivalence class of expressions denoting the function in
question. This equivalence class contains (\x -> x + x) and
(\x -> 2 * x)
2) The IO action produced by serialize selects a string representation
for one of these expressions at random*.
*) You probably won't ever observe "\x -> x + x" being the output of
(serialize (\x -> 2 * x)), but that's just a coincidence.
This explanation is not available if serialize has type
(Integer -> Integer) -> String, however. Those have to be mathematical
functions (of a sort) that select a unique representation for each equivalence
class, or else we have to get a completely different (and much less nice)
denotational semantics for Haskell.
Perhaps I missed it, but: has anyone explained why:
(a -> b) -> IO String
is an unacceptable type? Why does it need to be just String? Folks often ask,
"what benefit is purity?" Well, what is the benefit of making this function
impure? GHC has a lot of normally impure functionality incorporated in a way
that is mathematically sound (though not ideal). What are the things that have
been shot down? Or does IO, ST, etc. qualify as getting shot down?
More information about the Haskell-Cafe