[Proposal] Strict `sum` and `product`

chessai chessai1996 at gmail.com
Tue Oct 27 01:16:09 UTC 2020


Emily, I already put up an MR:
https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4355

On Mon, Oct 26, 2020, 20:14 Emily Pillmore <emilypi at cohomolo.gy> wrote:

> Just to keep this going so it doesn't die out. I'm going to request
> that Hécate submit a PR to address the composite criticism here,
> implementing `sum|product` in terms of `foldr`, and providing a strict
> equivalent `sum'|product'` variant for each as discussed. We'll move the
> discussion to that PR.
>
> Cheers,
> Emily
>
>
> On Thu, Oct 22, 2020 at 7:05 AM, Fumiaki Kinoshita <fumiexcel at gmail.com>
> wrote:
>
>> For a forced example, consider a wrapping of Double with a test
>>> for infinity on the left argument
>>>
>>>>>>  a + b | isInfinite a = a
>>>>>>
>>
>> This makes sense only if sum is foldr (+) 0, while the implementation of
>> sum for lists is foldl (+) 0. But hey, we are comparing a hypothetical type
>> with hundreds of real world uses.
>>
>> Haskell should remain a lazy language
>>>
>> I'm not sure if I agree. We've been bitten by laziness too much, hence
>> the introduction of BangPatterns, StrictData, deepseq, etc. IMO Haskell
>> ecosystem should be more strict.
>>
>> I'm uncomfortable with making foldMap/sum strict.
>>>
>>
>> No one is suggesting to change foldMap; what OP wants to change is sum
>> and product, and foldMap _happens to be_ the current default implementation.
>>
>> > we dont knowing the context. the best way to avoid the cost of
>>> computation is not to compute it; what alternatives are there
>>> i this s[ecific problem.
>>>
>>
>> The context is pretty clear; the status quo (sum = foldl (+) 0) is
>> flat-out wrong. I'd say more than 80% of the uses of sum is supposed to
>> look at all the elements. And even if (+) is short-circuiting in the way
>> Jon suggested, lazy foldl does not take any advantage of it.
>>
>> if one was to make foldMap/sum strict why not also other functions, e.g.
>>> scanAccum, where does one stop on the grounds of possibly saving costs in
>>> so as yet
>>> unknown application
>>>
>> I'm not sure which function you are talking about (there's no function
>> called scanAccum in base nor containers). If you mean scanl, lazy
>> accumulator still makes sense because the resulting list can be consumed
>> incrementally. foldl is not.
>>
>> Instead of thinking/programming  in a pure lazy language, you are then in
>>> the position of thinking lazily apart from some set of
>>> to be determined exceptional cases. the simplicity and uniformity of the
>>> model is gone.
>>>
>> Again, there's little to be gained from lazy foldl. It's avoided in
>> general and I'd say it's not "uniform" to use the function.
>>
>> 4. As per item 3 but from a pedagogical point of view. I taught UG  FP
>>> for years. Complicating the model is not going to help students,
>>> to say nothing of the caveats that would have to be added to
>>> existing teaching materials.
>>>
>> Also from a pedagogical point of view, a number of beginners fall into a
>> trap of lazy foldl. Caveats should be added to the current one if at all,
>> but this danger zone can be removed with one byte change.
>>
>> I'm seriously worried that the use of foldl in the standard library might
>> make beginners think that it's fine to use foldl.
>>
>> As far as I remember, the same change has been suggested a while ago but
>> didn't make it. I hope we don't get discouraged this time and go ahead with
>> the change.
>>
>> 2020年10月22日(木) 19:19 David Duke <duke.j.david at gmail.com>:
>>
>>> Concur with Jon's point.
>>> I'm uncomfortable with making foldMap/sum strict.
>>> Haskell was intended to explore a region of the programming design space
>>> that had not been well covered. and it remains
>>> if not unique  then at least exemplary in that respect. When  I write
>>> in Haskell I do so with the knowledge that I am working with
>>> lazy evaluation with the benefits and risk that comes with that. If I'm
>>> desperate to have finer operational control, I  either
>>> swallow that pill and adapt my use of Haskell accordingly, or pick up a
>>> different tool from the programming language toolbox.
>>>
>>> Now the OP pointed to an issue with one specific use of sum. However,
>>> that seems like weak grounds to make library changes:
>>> 1. we dont knowing the context. the best way to avoid the cost of
>>> computation is not to compute it; what alternatives are there
>>> i this s[ecific problem.
>>> 2. if one was to make foldMap/sum strict why not also other functions,
>>> e.g. scanAccum, where does one stop on the grounds of possibly saving costs
>>> in so as yet
>>> unknown application
>>> 3. Instead of thinking/programming  in a pure lazy language, you are
>>> then in the position of thinking lazily apart from some set of
>>> to be determined exceptional cases. the simplicity and uniformity of the
>>> model is gone.
>>> 4. As per item 3 but from a pedagogical point of view. I taught UG  FP
>>> for years. Complicating the model is not going to help students,
>>> to say nothing of the caveats that would have to be added to
>>> existing teaching materials.
>>> So whilst  the post was clearly intended to trigger discussion, in terms
>>> on conveying sentiment a strict  foldMap would be a big '-'
>>> for me personally. Whilst there are performance challenges in working
>>>  with Haskell. The primary role of the programming notation
>>> is allowing one to think about the problem and obtain a clear, correct
>>> solution. Performance tuning comes later if at all, and
>>> her are known techniques for doing so  with Haskell/ghc.
>>> 5. OTOH I see little   harm in adding a library that provides a strict
>>> foldMap for those
>>> who want to use it after  deliberate thought.
>>>
>>> YMMMocV
>>> David
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Oct 22, 2020 at 10:36 AM Jon Fairbairn <
>>> jon.fairbairn at cl.cam.ac.uk> wrote:
>>>
>>>> Sylvain Henry <sylvain at haskus.fr> writes:
>>>>
>>>> > Do we have an actual example of the use of a lazy sum in the
>>>> > wild?
>>>>
>>>> That’s not a very good argument, though I’ve seen it used quite
>>>> a lot.  The problem is that just because no-one can currently
>>>> think of a “real” example it doesn’t mean that there are no
>>>> cases where laziness is required.
>>>>
>>>> For a forced example, consider a wrapping of Double with a test
>>>> for infinity on the left argument
>>>>
>>>>>>>>  a + b | isInfinite a = a
>>>>>>>>
>>>> which doesn’t evaluate it’s second argument if the first is
>>>> infinite.
>>>>
>>>> > If the most common case is the strict one, making `sum`
>>>> > strict would be a better default.
>>>>
>>>> Haskell should remain a lazy language, so the defaults should be
>>>> lazy unless they can be proved strict anyway.
>>>>
>>>> > If need be we could also provide `lazySum` or something, but
>>>> > is there really a need?
>>>>
>>>> Who can truly say? I’d support sum' as the strict version as
>>>> that is consistent with other usages.
>>>>
>>>>   — Jón
>>>>
>>>> >
>>>> > Sylvain
>>>> >
>>>> >
>>>> > On 18/10/2020 22:16, Oleg Grenrus wrote:
>>>> >>
>>>> >> Sorry I wasn't clear myself. Option C is "add sum'"
>>>> >>
>>>> >> For lists:
>>>> >>
>>>> >>     sum  = foldr  (+) 0
>>>> >>     sum' = foldl' (+) 0
>>>> >>
>>>> >> For Foldable
>>>> >>
>>>> >>     sum  = getSum #. foldMap  Sum
>>>> >>     sum' = getSum #. foldMap' Sum
>>>> >>
>>>> >> - Oleg
>>>> >>
>>>> >> On 18.10.2020 23.11, Hécate wrote:
>>>> >>>
>>>> >>> Indeed, and I initially went to suggest a
>>>> >>> foldMap'`-based implementation to keep with the current
>>>> >>> implementation of many Foldable functions that are based
>>>> >>> on `foldMap` rather than a raw `fold`.
>>>> >>>
>>>> >>> On 18/10/2020 22:04, Oleg Grenrus wrote:
>>>> >>>> For the sake of bigger audience I didn't bother
>>>> >>>> mentioning #. which is a coercion helper. It's
>>>> >>>> essentially better (.) when the first argument is
>>>> >>>> newtype constructor (i.e. coerce).
>>>> >>>>
>>>> >>>> So with Option A (strict):
>>>> >>>>
>>>> >>>>     sum = getSum #. foldMap' Sum
>>>> >>>>
>>>> >>>> Or Option B (lazy)
>>>> >>>>
>>>> >>>>     sum = getSum #. foldMap Sum
>>>> >>>>
>>>> >>>> ---
>>>> >>>>
>>>> >>>> There is also third option, Option C:
>>>> >>>>
>>>> >>>>     sum  = foldr  (+) 0
>>>> >>>>     sum' = foldl' (+) 0
>>>> >>>>
>>>> >>>> I don't think this is worthwhile, but it is an option.
>>>> >>>> (to rehash, I don't consider maintaining status quo to
>>>> >>>> be an option at all).
>>>> >>>>
>>>> >>>> - Oleg
>>>> >>>> On 18.10.2020 22.54, Vanessa McHale wrote:
>>>> >>>>>
>>>> >>>>> It's
>>>> >>>>>
>>>> >>>>> sum = getSum #. foldMap Sum
>>>> >>>>>
>>>> >>>>> in base.
>>>> >>>>>
>>>> >>>>> On 10/18/20 2:49 PM, Oleg Grenrus wrote:
>>>> >>>>>>
>>>> >>>>>> The problem is the current definition of sum for lists
>>>> >>>>>> which uses foldl, i.e non-strict left fold
>>>> >>>>>>
>>>> >>>>>>     sum = foldl (+) 0
>>>> >>>>>>
>>>> >>>>>> It's utterly broken. Either we should change it to
>>>> >>>>>> foldl' to work on some types where addition is strict,
>>>> >>>>>> Option A:
>>>> >>>>>>
>>>> >>>>>>     sum = foldl' (+) 0
>>>> >>>>>>
>>>> >>>>>> or alternatively (to make people using lazy
>>>> >>>>>> accumulator types), Option B:
>>>> >>>>>>
>>>> >>>>>>     sum = foldr (+) 0
>>>> >>>>>>
>>>> >>>>>> The current state is no good for anyone or anything.
>>>> >>>>>>
>>>> >>>>>> ---
>>>> >>>>>>
>>>> >>>>>> Related issue which Hecate didn't clearly mention, is
>>>> >>>>>> that Foldable class default implementation has
>>>> >>>>>>
>>>> >>>>>>    class Foldable f where
>>>> >>>>>>        ...
>>>> >>>>>>        sum = getSum . foldMap Sum -- this is "good" lazy
>>>> definition
>>>> >>>>>>
>>>> >>>>>> If we select option A, then I argue that for
>>>> >>>>>> consistency the default `Foldable.sum` should be
>>>> >>>>>>
>>>> >>>>>>        sum = getSum . foldMap' Sum -- strict foldMap'
>>>> >>>>>>
>>>> >>>>>> If we select option B, Foldable definition doesn't need to be
>>>> changed.
>>>> >>>>>>
>>>> >>>>>> ---
>>>> >>>>>>
>>>> >>>>>> I repeat, using non-strict left fold, foldl, for sum
>>>> >>>>>> and product is not good for anything.
>>>> >>>>>> Either foldr or foldl'.
>>>> >>>>>>
>>>> >>>>>> I have no strong preference. Current state is unacceptable.
>>>> >>>>>>
>>>> >>>>>> -  Oleg
>>>> >>>>>>
>>>> >>>>>> On 18.10.2020 22.24, Henning Thielemann wrote:
>>>> >>>>>>>
>>>> >>>>>>> On Sun, 18 Oct 2020, Hécate wrote:
>>>> >>>>>>>
>>>> >>>>>>>> In conclusion, leaving things to the optimiser that
>>>> >>>>>>>> could be trivially made fast every time seems
>>>> >>>>>>>> needlessly risky.
>>>> >>>>>>>
>>>> >>>>>>> `seq` is still a hack. A strict 'sum' and 'product'
>>>> >>>>>>> would still fail on a lazy accumulator type, say a
>>>> >>>>>>> lazy pair type. If at all, sum and product should be
>>>> >>>>>>> deepseq-strict. So currently, letting the optimiser
>>>> >>>>>>> make a lazy sum strict is still the smaller hack.
>>>> >>>>>>>
>>>> >>>>>>> _______________________________________________
>>>> >>>>>>> Libraries mailing list
>>>> >>>>>>> Libraries at haskell.org
>>>> >>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>> >>>>>>
>>>> >>>>>> _______________________________________________
>>>> >>>>>> Libraries mailing list
>>>> >>>>>> Libraries at haskell.org
>>>> >>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>> >>>>>
>>>> >>>>> _______________________________________________
>>>> >>>>> Libraries mailing list
>>>> >>>>> Libraries at haskell.org
>>>> >>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>> >>>>
>>>> >>>> _______________________________________________
>>>> >>>> Libraries mailing list
>>>> >>>> Libraries at haskell.org
>>>> >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>> >>> --
>>>> >>> Hécate [image: ✨]
>>>> >>> IRC: Uniaika
>>>> >>> WWW:https://glitchbra.in
>>>> >>> RUN: BSD
>>>> >>>
>>>> >>> _______________________________________________
>>>> >>> Libraries mailing list
>>>> >>> Libraries at haskell.org
>>>> >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>> >>
>>>> >> _______________________________________________
>>>> >> Libraries mailing list
>>>> >> Libraries at haskell.org
>>>> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>> > _______________________________________________
>>>> > Libraries mailing list
>>>> > Libraries at haskell.org
>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>
>>>> --
>>>> Jón Fairbairn
>>>> Jon.Fairbairn at cl.cam.ac.uk
>>>> http://www.chaos.org.uk/~jf/Stuff-I-dont-want.html  (updated
>>>> 2014-04-05)
>>>>
>>>> _______________________________________________
>>>> Libraries mailing list
>>>> Libraries at haskell.org
>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>>
>>>
>>>
>>> --
>>> David Duke
>>> Emeritus Professor of Computer Science
>>> School of Computing University of Leeds UK
>>> E:duke.j.david at gmail.com
>>> W:https://engineering.leeds.ac.uk/staff/334/Professor_David_Duke
>>> _______________________________________________
>>> Libraries mailing list
>>> Libraries at haskell.org
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>>
>> _______________________________________________
>> Libraries mailing list
>> Libraries at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20201026/41656d71/attachment.html>


More information about the Libraries mailing list