Drastic Prelude changes imminent
Edward Kmett
ekmett at gmail.com
Sun Feb 1 20:56:10 UTC 2015
On Sun, Feb 1, 2015 at 2:59 PM, Iavor Diatchki <iavor.diatchki at gmail.com>
wrote:
> Hello,
>
> I think that "dead end #3" is the right choice, and keep Foldable etc. in
> their own modules. We have a module system, we should use it. In
> general, I think if we are to "burn bridges" it should be by moving things
> out of the Prelude, not adding more abstractions there.
>
> It'd strike me as incredibly arbitrary to wind up with some weird half-way
point where mapM and traverse generalize but where _obviously_ related
combinators such as mapM_ and traverse_ don't.
For the sake of concreteness, here are some of the things I don't like
> about the current design:
>
> (e.g., why are 'sum' and 'product' there??).
>
We have a number of competing concerns to balance in Foldable.
1.) We absolutely cannot have rampant performance regressions.
2.) Existing code should continue to as large an extent as much possible.
3.) We shouldn't randomly change the semantics of existing code.
4.) We want a minimal class.
Removing `sum` and `product` from the class serves desire #4 perfectly.
However, it incurs problems for desire #1 and #3.
There are really 3 camps about how Data.Foldable.sum should be implemented.
Camp 1)
sum = foldl (+) 0
is the implementation that is in the Haskell Report.
This implementation is actually pretty much maximally bad for all usecases.
It has uses foldl so it leaks all sorts of memory while building up the
structure that it finally folds.
Camp 2)
sum = foldl' (+) 0
is the implementation that folks like Duncan would rather see.
This kills most of the space and performance problems with the standard
implementation, but switching could introduce bottoms in existing programs.
Camp 3)
sum = getSum . foldMap sum
is the implementation that ensures that it doesn't destroy the asymptotics
of the number of uses of 'mappend' in foldMap.
The right container can readily fold 2^20th a's with 20 mappends.
The best the camp 2 solution can do is still 1 million steps, but at least
those steps aren't randomly leaking space.
The downside of this is that this implementation can use more stack in the
cases where ghc was smart enough to monomorphize the original Prelude
solution to a number type where it could figure out via strictness that it
could do an efficient left fold.
Similarly consider maximum from the Report:
maximum, minimum :: (Ord a) => [a] -> a
maximum [] = error "Prelude.maximum: empty list"
maximum xs = foldl1 max xs
vs. maximum from the Data.Foldable:
-- | The largest element of a non-empty structure.
maximum :: (Foldable t, Ord a) => t a -> a
maximum = foldr1 max
There are two concerns that caused us to decide to include these as members
in the class as part of 7.10.
* Picking one *silently changes the semantics of existing programs* with
consequences that are hard to predict.
* A separate libraries@ proposal was filed by David Feuer and Reid Barton
was filed *and passed* to add a number of additional members to
Data.Foldable for yet another concern, which is to allow these operations
to work with better asymptotics for many containers.
Foldable only ever consumes 'f' in negative position, it never produces an
'f', which means that unlike every other class we have that works
parametrically over something of kind * -> *, we can know something about
'a'. This enables folks to construct containers where many of these
operations are O(log n) or O(1).
Included, perhaps gratuitously, in that proposal were the additional
generalizations for null and length.
Those who proposed these extensions had a desire to allow users to improve
the asymptotics of these operations, and nobody who is now declaiming the
size of the class objected back in September when the proposal went through
on the mailing list to a number of enthusiastic +1's and a couple of
lukewarm dislikes.
Personally I'm on the fence about the latter set of additions that exist
purely to improve the asymptotics of some operations over and above what
you should be able to get by computing directly with a given choice of
monoid, but I'm also not particularly fond of saying that what we're going
to do with everything that passes through the proposal process is wait 3
months until the midnight hour before a release and then recolor the
bikeshed.
> - the state of Data.List: I generally prefer that, when possible, a
> datatype should provide non-overloaded access to its functionality, and in
> addition, there can be instances to expose the same functionality via an
> overloaded API.
>
Personally, I'd just as soon remove these members from this module
entirely. They act as an active barrier against proposals such as Neil and
Lennart's that for changes like this -- or much more practically, for other
changes in the future -- that we consider exposing an alternate Prelude.
Data.List derives most of its members as a renaming of PreludeList from the
standard report.
The problem is the *vast majority* of the users of Data.List, do so by
importing it unqualified simply to get 'sort'.
They aren't actively seeking to introduce a conflicting import of foldr!
The benefit of offering a monomorphic version of these combinators in there
compares badly against the level of code breakage that it would introduce.
At an earlier stage in the design we had a Data.OldList module, which
contained the original signatures, but having it required us to tie the
import graph in base in knots.
Reintroducing such a module is definitely a viable option, the pain
involved in adopting is mostly borne internally by base.
> Of course, I can generally work around all of those, so not a big deal.
> However, it'd be nice if we could come up with a design that, more or
> less, the whole community thinks is a good one.
>
The very reason why a core libraries committee was formed was because there
was a great deal of frustration with the inability to make headway on these
sorts of issues by universal consent.
We had a few cases like Ian unilaterally removing Eq and Show as a
superclass of Num, but really it was just "who had commit access" that
determined what could happen, and nobody who felt like they had the
responsibility to ensure the stability of base was balanced against the
increasing pressures on it from different fronts.
As the Haskell community has grown larger and larger getting every one of
thousands of contributors to the ecosystem to agree on every fine detail
when you have issues where you have a very large design surface has become
progressively more and more improbable.
Unanimity is a laudable goal, but often unattainable. That said, I very
much believe in establishing as broad of a consensus as possible.
-Edward
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/libraries/attachments/20150201/fb807f8c/attachment.html>
More information about the Libraries
mailing list