[Haskell-cafe] Eta Reduction

Edward Kmett ekmett at gmail.com
Tue Apr 1 14:32:27 UTC 2014


Check the date and consider the process necessary to "enumerate all Haskell
programs and check their types".


On Tue, Apr 1, 2014 at 9:17 AM, John Lato <jwlato at gmail.com> wrote:

> I think this is a great idea and should become a top priority. I would
> probably start by switching to a type-class-based seq, after which perhaps
> the next step forward would become more clear.
> John L.
> On Apr 1, 2014 2:54 AM, "Dan Doel" <dan.doel at gmail.com> wrote:
>> In the past year or two, there have been multiple performance problems in
>> various areas related to the fact that lambda abstraction is not free,
>> though we
>> tend to think of it as so. A major example of this was deriving of
>> Functor. If we
>> were to derive Functor for lists, we would end up with something like:
>>     instance Functor [] where
>>       fmap _ [] = []
>>       fmap f (x:xs) = f x : fmap (\y -> f y) xs
>> This definition is O(n^2) when fully evaluated,, because it causes O(n)
>> eta
>> expansions of f, so we spend time following indirections proportional to
>> the
>> depth of the element in the list. This has been fixed in 7.8, but there
>> are
>> other examples. I believe lens, [1] for instance, has some stuff in it
>> that
>> works very hard to avoid this sort of cost; and it's not always as easy
>> to avoid
>> as the above example. Composing with a newtype wrapper, for instance,
>> causes an
>> eta expansion that can only be seen as such at the core level.
>> The obvious solution is: do eta reduction. However, this is not
>> operationally
>> sound currently. The problem is that seq is capable of telling the
>> difference
>> between the following two expressions:
>>     undefined
>>     \x -> undefined x
>> The former causes seq to throw an exception, while the latter is
>> considered
>> defined enough to not do so. So, if we eta reduce, we can cause
>> terminating
>> programs to diverge if they make use of this feature.
>> Luckily, there is a solution.
>> Semantically one would usually identify the above two expressions. While
>> I do
>> believe one could construct a semantics that does distinguish them, it is
>> not
>> the usual practice. This suggests that there is a way to not distinguish
>> them,
>> perhaps even including seq. After all, the specification of seq is
>> monotone and
>> continuous regardless of whether we unify ⊥ with \x -> ⊥ x or insert an
>> extra
>> element for the latter.
>> The currently problematic case is function spaces, so I'll focus on it.
>> How
>> should:
>>     seq : (a -> b) -> c -> c
>> act? Well, other than an obvious bottom, we need to emit bottom whenever
>> our
>> given function is itself bottom at every input. This may first seem like a
>> problem, but it is actually quite simple. Without loss of generality, let
>> us
>> assume that we can enumerate the type a. Then, we can feed these values
>> to the
>> function, and check their results for bottom. Conal Elliot has prior art
>> for
>> this sort of thing with his unamb [2] package. For each value x :: a,
>> simply
>> compute 'f x `seq` ()' in parallel, and look for any successes. If we
>> ever find
>> one, we know the function is non-bottom, and we can return our value of
>> c. If we
>> never finish searching, then the function must be bottom, and seq should
>> not
>> terminate, so we have satisfied the specification.
>> Now, some may complain about the enumeration above. But this, too, is a
>> simple
>> matter. It is well known that Haskell programs are denumerable. So it is
>> quite
>> easy to enumerate all Haskell programs that produce a value, check
>> whether that
>> value has the type we're interested in, and compute said value. All of
>> this can
>> be done in Haskell. Thus, every Haskell type is programatically
>> enumerable in
>> Haskell, and we can use said enumeration in our implementation of seq for
>> function types. I have discussed this with Russell O'Connor [3], and he
>> assures
>> me that this argument should be sufficient even if we consider semantic
>> models
>> of Haskell that contain values not denoted by any Haskell program, so we
>> should
>> be safe there.
>> The one possible caveat is that this implementation of seq is not
>> operationally
>> uniform across all types, so the current fully polymorphic type of seq
>> may not
>> make sense. But moving back to a type class based approach isn't so bad,
>> and
>> this time we will have a semantically sound backing, instead of just
>> having a
>> class with the equivalent of the current magic function in it.
>> Once this machinery is in place, we can eta reduce to our hearts'
>> content, and
>> not have to worry about breaking semantics. We'd no longer put the burden
>> on
>> programmers to use potentially unsafe hacks to avoid eta expansions. I
>> apologize
>> for not sketching an implementation of the above algorithm, but I'm sure
>> it
>> should be elementary enough to make it into GHC in the next couple
>> versions.
>> Everyone learns about this type of thing in university computer science
>> programs, no?
>> Thoughts? Comments? Questions?
>> Cheers,
>> -- Dan
>> [1] http://hackage.haskell.org/package/lens
>> [2] http://hackage.haskell.org/package/unamb
>> [3] http://r6.ca/
>> _______________________________________________
>> Glasgow-haskell-users mailing list
>> Glasgow-haskell-users at haskell.org
>> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/glasgow-haskell-users/attachments/20140401/f36a0085/attachment.html>

More information about the Glasgow-haskell-users mailing list