[Proposal] Strict `sum` and `product`

Emily Pillmore emilypi at cohomolo.gy
Tue Oct 27 01:45:08 UTC 2020


Nice, thanks Chessai

On Mon, Oct 26, 2020 at 9:16 PM, chessai < chessai1996 at gmail.com > wrote:

> 
> Emily, I already put up an MR: https:/ / gitlab. haskell. org/ ghc/ ghc/ -/
> merge_requests/ 4355 (
> https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4355 )
> 
> On Mon, Oct 26, 2020, 20:14 Emily Pillmore < emilypi@ cohomolo. gy (
> emilypi at cohomolo.gy ) > wrote:
> 
> 
>> Just to keep this going so it doesn't die out. I'm going to request that
>> Hécate submit a PR to address the composite criticism here, implementing
>> `sum|product` in terms of `foldr`, and providing a strict equivalent
>> `sum'|product'` variant for each as discussed. We'll move the discussion
>> to that PR.
>> 
>> 
>> 
>> Cheers,
>> 
>> Emily
>> 
>> 
>> 
>> 
>> On Thu, Oct 22, 2020 at 7:05 AM, Fumiaki Kinoshita < fumiexcel@ gmail. com
>> ( fumiexcel at gmail.com ) > wrote:
>> 
>>> 
>>>> For a forced example, consider a wrapping of Double with a test
>>>> for infinity on the left argument
>>>> 
>>>>>>>> a + b | isInfinite a = a
>>>>>>>> 
>>> 
>>> 
>>> 
>>> This makes sense only if sum is foldr (+) 0, while the implementation of
>>> sum for lists is foldl (+) 0. But hey, we are comparing a hypothetical
>>> type with hundreds of real world uses.
>>> 
>>> 
>>> 
>>> 
>>>> Haskell should remain a lazy language
>>>> 
>>> 
>>> I'm not sure if I agree. We've been bitten by laziness too much, hence the
>>> introduction of BangPatterns, StrictData, deepseq, etc. IMO Haskell
>>> ecosystem should be more strict.
>>> 
>>> 
>>> 
>>>> I'm uncomfortable with making foldMap/sum strict.
>>>> 
>>> 
>>> 
>>> 
>>> No one is suggesting to change foldMap; what OP wants to change is sum and
>>> product, and foldMap _happens to be_ the current default implementation.
>>> 
>>> 
>>> 
>>>> > we dont knowing the context. the best way to avoid the cost of
>>>> computation is not to compute it; what alternatives are there
>>>> i this s[ecific problem.
>>>> 
>>> 
>>> 
>>> 
>>> The context is pretty clear; the status quo (sum = foldl (+) 0) is
>>> flat-out wrong. I'd say more than 80% of the uses of sum is supposed to
>>> look at all the elements. And even if (+) is short-circuiting in the way
>>> Jon suggested, lazy foldl does not take any advantage of it.
>>> 
>>> 
>>> 
>>>> if one was to make foldMap/sum strict why not also other functions, e.g.
>>>> scanAccum, where does one stop on the grounds of possibly saving costs in
>>>> so as yet
>>>> unknown application
>>>> 
>>> 
>>> I'm not sure which function you are talking about (there's no function
>>> called scanAccum in base nor containers). If you mean scanl, lazy
>>> accumulator still makes sense because the resulting list can be consumed
>>> incrementally. foldl is not.
>>> 
>>> 
>>> 
>>> 
>>>> Instead of thinking/programming  in a pure lazy language, you are then in
>>>> the position of thinking lazily apart from some set of
>>>> to be determined exceptional cases. the simplicity and uniformity of the
>>>> model is gone.
>>>> 
>>> 
>>> Again, there's little to be gained from lazy foldl. It's avoided in
>>> general and I'd say it's not "uniform" to use the function.
>>> 
>>> 
>>> 
>>>> 4. As per item 3 but from a pedagogical point of view. I taught UG  FP for
>>>> years. Complicating the model is not going to help students,
>>>> to say nothing of the caveats that would have to be added to existing
>>>> teaching materials.
>>>> 
>>> 
>>> Also from a pedagogical point of view, a number of beginners fall into a
>>> trap of lazy foldl. Caveats should be added to the current one if at all,
>>> but this danger zone can be removed with one byte change.
>>> 
>>> 
>>> I'm seriously worried that the use of foldl in the standard library might
>>> make beginners think that it's fine to use foldl.
>>> 
>>> 
>>> As far as I remember, the same change has been suggested a while ago but
>>> didn't make it. I hope we don't get discouraged this time and go ahead
>>> with the change.
>>> 
>>> 
>>> 
>>> 2020年10月22日(木) 19:19 David Duke < duke. j. david@ gmail. com (
>>> duke.j.david at gmail.com ) >:
>>> 
>>> 
>>>> Concur with Jon's point.
>>>> I'm uncomfortable with making foldMap/sum strict.
>>>> Haskell was intended to explore a region of the programming design space
>>>> that had not been well covered. and it remains
>>>> if not unique  then at least exemplary in that respect. When  I write in
>>>> Haskell I do so with the knowledge that I am working with
>>>> lazy evaluation with the benefits and risk that comes with that. If I'm
>>>> desperate to have finer operational control, I  either
>>>> swallow that pill and adapt my use of Haskell accordingly, or pick up a
>>>> different tool from the programming language toolbox.
>>>> 
>>>> 
>>>> Now the OP pointed to an issue with one specific use of sum. However, that
>>>> seems like weak grounds to make library changes:
>>>> 1. we dont knowing the context. the best way to avoid the cost of
>>>> computation is not to compute it; what alternatives are there
>>>> i this s[ecific problem.
>>>> 2. if one was to make foldMap/sum strict why not also other functions,
>>>> e.g. scanAccum, where does one stop on the grounds of possibly saving
>>>> costs in so as yet
>>>> unknown application
>>>> 3. Instead of thinking/programming  in a pure lazy language, you are then
>>>> in the position of thinking lazily apart from some set of
>>>> to be determined exceptional cases. the simplicity and uniformity of the
>>>> model is gone.
>>>> 4. As per item 3 but from a pedagogical point of view. I taught UG  FP for
>>>> years. Complicating the model is not going to help students,
>>>> to say nothing of the caveats that would have to be added to existing
>>>> teaching materials.
>>>> So whilst  the post was clearly intended to trigger discussion, in terms
>>>> on conveying sentiment a strict  foldMap would be a big '-'
>>>> for me personally. Whilst there are performance challenges in working 
>>>> with Haskell. The primary role of the programming notation
>>>> is allowing one to think about the problem and obtain a clear, correct
>>>> solution. Performance tuning comes later if at all, and
>>>> her are known techniques for doing so  with Haskell/ghc.
>>>> 5. OTOH I see little   harm in adding a library that provides a strict
>>>> foldMap for those
>>>> who want to use it after  deliberate thought.
>>>> 
>>>> 
>>>> YMMMocV
>>>> David
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Thu, Oct 22, 2020 at 10:36 AM Jon Fairbairn < jon. fairbairn@ cl. cam. ac.
>>>> uk ( jon.fairbairn at cl.cam.ac.uk ) > wrote:
>>>> 
>>>> 
>>>>> Sylvain Henry < sylvain@ haskus. fr ( sylvain at haskus.fr ) > writes:
>>>>> 
>>>>> > Do we have an actual example of the use of a lazy sum in the
>>>>> > wild?
>>>>> 
>>>>> That’s not a very good argument, though I’ve seen it used quite
>>>>> a lot.  The problem is that just because no-one can currently
>>>>> think of a “real” example it doesn’t mean that there are no
>>>>> cases where laziness is required.
>>>>> 
>>>>> For a forced example, consider a wrapping of Double with a test
>>>>> for infinity on the left argument
>>>>> 
>>>>>>>>>> a + b | isInfinite a = a
>>>>>>>>>> 
>>>>> which doesn’t evaluate it’s second argument if the first is
>>>>> infinite.
>>>>> 
>>>>> > If the most common case is the strict one, making `sum`
>>>>> > strict would be a better default.
>>>>> 
>>>>> Haskell should remain a lazy language, so the defaults should be
>>>>> lazy unless they can be proved strict anyway.
>>>>> 
>>>>> > If need be we could also provide `lazySum` or something, but
>>>>> > is there really a need?
>>>>> 
>>>>> Who can truly say? I’d support sum' as the strict version as
>>>>> that is consistent with other usages.
>>>>> 
>>>>> — Jón
>>>>> 
>>>>> >
>>>>> > Sylvain
>>>>> >
>>>>> >
>>>>> > On 18/10/2020 22:16, Oleg Grenrus wrote:
>>>>> >>
>>>>> >> Sorry I wasn't clear myself. Option C is "add sum'"
>>>>> >>
>>>>> >> For lists:
>>>>> >>
>>>>> >>     sum  = foldr  (+) 0
>>>>> >>     sum' = foldl' (+) 0
>>>>> >>
>>>>> >> For Foldable
>>>>> >>
>>>>> >>     sum  = getSum #. foldMap  Sum
>>>>> >>     sum' = getSum #. foldMap' Sum
>>>>> >>
>>>>> >> - Oleg
>>>>> >>
>>>>> >> On 18.10.2020 23.11, Hécate wrote:
>>>>> >>>
>>>>> >>> Indeed, and I initially went to suggest a
>>>>> >>> foldMap'`-based implementation to keep with the current
>>>>> >>> implementation of many Foldable functions that are based
>>>>> >>> on `foldMap` rather than a raw `fold`.
>>>>> >>>
>>>>> >>> On 18/10/2020 22:04, Oleg Grenrus wrote:
>>>>> >>>> For the sake of bigger audience I didn't bother
>>>>> >>>> mentioning #. which is a coercion helper. It's
>>>>> >>>> essentially better (.) when the first argument is
>>>>> >>>> newtype constructor (i.e. coerce).
>>>>> >>>>
>>>>> >>>> So with Option A (strict):
>>>>> >>>>
>>>>> >>>>     sum = getSum #. foldMap' Sum
>>>>> >>>>
>>>>> >>>> Or Option B (lazy)
>>>>> >>>>
>>>>> >>>>     sum = getSum #. foldMap Sum
>>>>> >>>>
>>>>> >>>> ---
>>>>> >>>>
>>>>> >>>> There is also third option, Option C:
>>>>> >>>>
>>>>> >>>>     sum  = foldr  (+) 0
>>>>> >>>>     sum' = foldl' (+) 0
>>>>> >>>>
>>>>> >>>> I don't think this is worthwhile, but it is an option.
>>>>> >>>> (to rehash, I don't consider maintaining status quo to
>>>>> >>>> be an option at all).
>>>>> >>>>
>>>>> >>>> - Oleg
>>>>> >>>> On 18.10.2020 22.54, Vanessa McHale wrote:
>>>>> >>>>>
>>>>> >>>>> It's
>>>>> >>>>>
>>>>> >>>>> sum = getSum #. foldMap Sum
>>>>> >>>>>
>>>>> >>>>> in base.
>>>>> >>>>>
>>>>> >>>>> On 10/18/20 2:49 PM, Oleg Grenrus wrote:
>>>>> >>>>>>
>>>>> >>>>>> The problem is the current definition of sum for lists
>>>>> >>>>>> which uses foldl, i.e non-strict left fold
>>>>> >>>>>>
>>>>> >>>>>>     sum = foldl (+) 0
>>>>> >>>>>>
>>>>> >>>>>> It's utterly broken. Either we should change it to
>>>>> >>>>>> foldl' to work on some types where addition is strict,
>>>>> >>>>>> Option A:
>>>>> >>>>>>
>>>>> >>>>>>     sum = foldl' (+) 0
>>>>> >>>>>>
>>>>> >>>>>> or alternatively (to make people using lazy
>>>>> >>>>>> accumulator types), Option B:
>>>>> >>>>>>
>>>>> >>>>>>     sum = foldr (+) 0
>>>>> >>>>>>
>>>>> >>>>>> The current state is no good for anyone or anything.
>>>>> >>>>>>
>>>>> >>>>>> ---
>>>>> >>>>>>
>>>>> >>>>>> Related issue which Hecate didn't clearly mention, is
>>>>> >>>>>> that Foldable class default implementation has
>>>>> >>>>>>
>>>>> >>>>>>    class Foldable f where
>>>>> >>>>>>        ...
>>>>> >>>>>>        sum = getSum . foldMap Sum -- this is "good" lazy definition
>>>>> 
>>>>> >>>>>>
>>>>> >>>>>> If we select option A, then I argue that for
>>>>> >>>>>> consistency the default `Foldable.sum` should be
>>>>> >>>>>>
>>>>> >>>>>>        sum = getSum . foldMap' Sum -- strict foldMap'
>>>>> >>>>>>
>>>>> >>>>>> If we select option B, Foldable definition doesn't need to be
>>>>> changed.
>>>>> >>>>>>
>>>>> >>>>>> ---
>>>>> >>>>>>
>>>>> >>>>>> I repeat, using non-strict left fold, foldl, for sum
>>>>> >>>>>> and product is not good for anything.
>>>>> >>>>>> Either foldr or foldl'.
>>>>> >>>>>>
>>>>> >>>>>> I have no strong preference. Current state is unacceptable.
>>>>> >>>>>>
>>>>> >>>>>> -  Oleg
>>>>> >>>>>>
>>>>> >>>>>> On 18.10.2020 22.24, Henning Thielemann wrote:
>>>>> >>>>>>>
>>>>> >>>>>>> On Sun, 18 Oct 2020, Hécate wrote:
>>>>> >>>>>>>
>>>>> >>>>>>>> In conclusion, leaving things to the optimiser that
>>>>> >>>>>>>> could be trivially made fast every time seems
>>>>> >>>>>>>> needlessly risky.
>>>>> >>>>>>>
>>>>> >>>>>>> `seq` is still a hack. A strict 'sum' and 'product'
>>>>> >>>>>>> would still fail on a lazy accumulator type, say a
>>>>> >>>>>>> lazy pair type. If at all, sum and product should be
>>>>> >>>>>>> deepseq-strict. So currently, letting the optimiser
>>>>> >>>>>>> make a lazy sum strict is still the smaller hack.
>>>>> >>>>>>>
>>>>> >>>>>>> _______________________________________________
>>>>> >>>>>>> Libraries mailing list
>>>>> >>>>>>> Libraries@ haskell. org ( Libraries at haskell.org )
>>>>> >>>>>>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries
>>>>> ( http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>>>> >>>>>>
>>>>> >>>>>> _______________________________________________
>>>>> >>>>>> Libraries mailing list
>>>>> >>>>>> Libraries@ haskell. org ( Libraries at haskell.org )
>>>>> >>>>>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>>>> >>>>>
>>>>> >>>>> _______________________________________________
>>>>> >>>>> Libraries mailing list
>>>>> >>>>> Libraries@ haskell. org ( Libraries at haskell.org )
>>>>> >>>>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>>>> >>>>
>>>>> >>>> _______________________________________________
>>>>> >>>> Libraries mailing list
>>>>> >>>> Libraries@ haskell. org ( Libraries at haskell.org )
>>>>> >>>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>>>> >>> --
>>>>> >>> Hécate ✨
>>>>> >>> IRC: Uniaika
>>>>> >>> WWW: https:/ / glitchbra. in ( https://glitchbra.in )
>>>>> >>> RUN: BSD
>>>>> >>>
>>>>> >>> _______________________________________________
>>>>> >>> Libraries mailing list
>>>>> >>> Libraries@ haskell. org ( Libraries at haskell.org )
>>>>> >>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>>>> >>
>>>>> >> _______________________________________________
>>>>> >> Libraries mailing list
>>>>> >> Libraries@ haskell. org ( Libraries at haskell.org )
>>>>> >> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>>>> > _______________________________________________
>>>>> > Libraries mailing list
>>>>> > Libraries@ haskell. org ( Libraries at haskell.org )
>>>>> > http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>>>> 
>>>>> --
>>>>> Jón Fairbairn Jon. Fairbairn@ cl. cam. ac. uk ( Jon.Fairbairn at cl.cam.ac.uk
>>>>> )
>>>>> http:/ / www. chaos. org. uk/ ~jf/ Stuff-I-dont-want. html (
>>>>> http://www.chaos.org.uk/~jf/Stuff-I-dont-want.html ) (updated 2014-04-05)
>>>>> 
>>>>> _______________________________________________
>>>>> Libraries mailing list
>>>>> Libraries@ haskell. org ( Libraries at haskell.org )
>>>>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> David Duke
>>>> Emeritus Professor of Computer Science
>>>> School of Computing University of Leeds UK
>>>> E:duke. j. david@ gmail. com ( E%3Aduke.j.david at gmail.com )
>>>> W: https:/ / engineering. leeds. ac. uk/ staff/ 334/ Professor_David_Duke (
>>>> https://engineering.leeds.ac.uk/staff/334/Professor_David_Duke )
>>>> _______________________________________________
>>>> Libraries mailing list
>>>> Libraries@ haskell. org ( Libraries at haskell.org )
>>>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>>> 
>>> 
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Libraries mailing list
>>> Libraries@ haskell. org ( Libraries at haskell.org )
>>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> _______________________________________________
>> Libraries mailing list
>> Libraries@ haskell. org ( Libraries at haskell.org )
>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
> 
> 
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20201027/3e95011e/attachment-0001.html>


More information about the Libraries mailing list