[Proposal] Strict `sum` and `product`

Emily Pillmore emilypi at cohomolo.gy
Tue Oct 27 01:12:56 UTC 2020


Just to keep this going so it doesn't die out. I'm going to request that Hécate submit a PR to address the composite criticism here, implementing `sum|product` in terms of `foldr`, and providing a strict equivalent `sum'|product'` variant for each as discussed. We'll move the discussion to that PR.

Cheers,

Emily

On Thu, Oct 22, 2020 at 7:05 AM, Fumiaki Kinoshita < fumiexcel at gmail.com > wrote:

> 
> 
>> For a forced example, consider a wrapping of Double with a test
>> for infinity on the left argument
>> 
>>>> a + b | isInfinite a = a
>>>> 
> 
> 
> 
> This makes sense only if sum is foldr (+) 0, while the implementation of
> sum for lists is foldl (+) 0. But hey, we are comparing a hypothetical
> type with hundreds of real world uses.
> 
> 
> 
> 
>> Haskell should remain a lazy language
>> 
> 
> I'm not sure if I agree. We've been bitten by laziness too much, hence the
> introduction of BangPatterns, StrictData, deepseq, etc. IMO Haskell
> ecosystem should be more strict.
> 
> 
> 
>> I'm uncomfortable with making foldMap/sum strict.
>> 
> 
> 
> 
> No one is suggesting to change foldMap; what OP wants to change is sum and
> product, and foldMap _happens to be_ the current default implementation.
> 
> 
> 
>> > we dont knowing the context. the best way to avoid the cost of
>> computation is not to compute it; what alternatives are there
>> i this s[ecific problem.
>> 
> 
> 
> 
> The context is pretty clear; the status quo (sum = foldl (+) 0) is
> flat-out wrong. I'd say more than 80% of the uses of sum is supposed to
> look at all the elements. And even if (+) is short-circuiting in the way
> Jon suggested, lazy foldl does not take any advantage of it.
> 
> 
> 
>> if one was to make foldMap/sum strict why not also other functions, e.g.
>> scanAccum, where does one stop on the grounds of possibly saving costs in
>> so as yet
>> unknown application
>> 
> 
> I'm not sure which function you are talking about (there's no function
> called scanAccum in base nor containers). If you mean scanl, lazy
> accumulator still makes sense because the resulting list can be consumed
> incrementally. foldl is not.
> 
> 
> 
> 
>> Instead of thinking/programming  in a pure lazy language, you are then in
>> the position of thinking lazily apart from some set of
>> to be determined exceptional cases. the simplicity and uniformity of the
>> model is gone.
>> 
> 
> Again, there's little to be gained from lazy foldl. It's avoided in
> general and I'd say it's not "uniform" to use the function.
> 
> 
> 
>> 4. As per item 3 but from a pedagogical point of view. I taught UG  FP for
>> years. Complicating the model is not going to help students,
>> to say nothing of the caveats that would have to be added to existing
>> teaching materials.
>> 
> 
> Also from a pedagogical point of view, a number of beginners fall into a
> trap of lazy foldl. Caveats should be added to the current one if at all,
> but this danger zone can be removed with one byte change.
> 
> 
> I'm seriously worried that the use of foldl in the standard library might
> make beginners think that it's fine to use foldl.
> 
> 
> As far as I remember, the same change has been suggested a while ago but
> didn't make it. I hope we don't get discouraged this time and go ahead
> with the change.
> 
> 
> 
> 2020年10月22日(木) 19:19 David Duke < duke. j. david@ gmail. com (
> duke.j.david at gmail.com ) >:
> 
> 
>> Concur with Jon's point.
>> I'm uncomfortable with making foldMap/sum strict.
>> Haskell was intended to explore a region of the programming design space
>> that had not been well covered. and it remains
>> if not unique  then at least exemplary in that respect. When  I write in
>> Haskell I do so with the knowledge that I am working with
>> lazy evaluation with the benefits and risk that comes with that. If I'm
>> desperate to have finer operational control, I  either
>> swallow that pill and adapt my use of Haskell accordingly, or pick up a
>> different tool from the programming language toolbox.
>> 
>> 
>> Now the OP pointed to an issue with one specific use of sum. However, that
>> seems like weak grounds to make library changes:
>> 1. we dont knowing the context. the best way to avoid the cost of
>> computation is not to compute it; what alternatives are there
>> i this s[ecific problem.
>> 2. if one was to make foldMap/sum strict why not also other functions,
>> e.g. scanAccum, where does one stop on the grounds of possibly saving
>> costs in so as yet
>> unknown application
>> 3. Instead of thinking/programming  in a pure lazy language, you are then
>> in the position of thinking lazily apart from some set of
>> to be determined exceptional cases. the simplicity and uniformity of the
>> model is gone.
>> 4. As per item 3 but from a pedagogical point of view. I taught UG  FP for
>> years. Complicating the model is not going to help students,
>> to say nothing of the caveats that would have to be added to existing
>> teaching materials.
>> So whilst  the post was clearly intended to trigger discussion, in terms
>> on conveying sentiment a strict  foldMap would be a big '-'
>> for me personally. Whilst there are performance challenges in working 
>> with Haskell. The primary role of the programming notation
>> is allowing one to think about the problem and obtain a clear, correct
>> solution. Performance tuning comes later if at all, and
>> her are known techniques for doing so  with Haskell/ghc.
>> 5. OTOH I see little   harm in adding a library that provides a strict
>> foldMap for those
>> who want to use it after  deliberate thought.
>> 
>> 
>> YMMMocV
>> David
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> On Thu, Oct 22, 2020 at 10:36 AM Jon Fairbairn < jon. fairbairn@ cl. cam. ac.
>> uk ( jon.fairbairn at cl.cam.ac.uk ) > wrote:
>> 
>> 
>>> Sylvain Henry < sylvain@ haskus. fr ( sylvain at haskus.fr ) > writes:
>>> 
>>> > Do we have an actual example of the use of a lazy sum in the
>>> > wild?
>>> 
>>> That’s not a very good argument, though I’ve seen it used quite
>>> a lot.  The problem is that just because no-one can currently
>>> think of a “real” example it doesn’t mean that there are no
>>> cases where laziness is required.
>>> 
>>> For a forced example, consider a wrapping of Double with a test
>>> for infinity on the left argument
>>> 
>>>>>> a + b | isInfinite a = a
>>>>>> 
>>> which doesn’t evaluate it’s second argument if the first is
>>> infinite.
>>> 
>>> > If the most common case is the strict one, making `sum`
>>> > strict would be a better default.
>>> 
>>> Haskell should remain a lazy language, so the defaults should be
>>> lazy unless they can be proved strict anyway.
>>> 
>>> > If need be we could also provide `lazySum` or something, but
>>> > is there really a need?
>>> 
>>> Who can truly say? I’d support sum' as the strict version as
>>> that is consistent with other usages.
>>> 
>>> — Jón
>>> 
>>> >
>>> > Sylvain
>>> >
>>> >
>>> > On 18/10/2020 22:16, Oleg Grenrus wrote:
>>> >>
>>> >> Sorry I wasn't clear myself. Option C is "add sum'"
>>> >>
>>> >> For lists:
>>> >>
>>> >>     sum  = foldr  (+) 0
>>> >>     sum' = foldl' (+) 0
>>> >>
>>> >> For Foldable
>>> >>
>>> >>     sum  = getSum #. foldMap  Sum
>>> >>     sum' = getSum #. foldMap' Sum
>>> >>
>>> >> - Oleg
>>> >>
>>> >> On 18.10.2020 23.11, Hécate wrote:
>>> >>>
>>> >>> Indeed, and I initially went to suggest a
>>> >>> foldMap'`-based implementation to keep with the current
>>> >>> implementation of many Foldable functions that are based
>>> >>> on `foldMap` rather than a raw `fold`.
>>> >>>
>>> >>> On 18/10/2020 22:04, Oleg Grenrus wrote:
>>> >>>> For the sake of bigger audience I didn't bother
>>> >>>> mentioning #. which is a coercion helper. It's
>>> >>>> essentially better (.) when the first argument is
>>> >>>> newtype constructor (i.e. coerce).
>>> >>>>
>>> >>>> So with Option A (strict):
>>> >>>>
>>> >>>>     sum = getSum #. foldMap' Sum
>>> >>>>
>>> >>>> Or Option B (lazy)
>>> >>>>
>>> >>>>     sum = getSum #. foldMap Sum
>>> >>>>
>>> >>>> ---
>>> >>>>
>>> >>>> There is also third option, Option C:
>>> >>>>
>>> >>>>     sum  = foldr  (+) 0
>>> >>>>     sum' = foldl' (+) 0
>>> >>>>
>>> >>>> I don't think this is worthwhile, but it is an option.
>>> >>>> (to rehash, I don't consider maintaining status quo to
>>> >>>> be an option at all).
>>> >>>>
>>> >>>> - Oleg
>>> >>>> On 18.10.2020 22.54, Vanessa McHale wrote:
>>> >>>>>
>>> >>>>> It's
>>> >>>>>
>>> >>>>> sum = getSum #. foldMap Sum
>>> >>>>>
>>> >>>>> in base.
>>> >>>>>
>>> >>>>> On 10/18/20 2:49 PM, Oleg Grenrus wrote:
>>> >>>>>>
>>> >>>>>> The problem is the current definition of sum for lists
>>> >>>>>> which uses foldl, i.e non-strict left fold
>>> >>>>>>
>>> >>>>>>     sum = foldl (+) 0
>>> >>>>>>
>>> >>>>>> It's utterly broken. Either we should change it to
>>> >>>>>> foldl' to work on some types where addition is strict,
>>> >>>>>> Option A:
>>> >>>>>>
>>> >>>>>>     sum = foldl' (+) 0
>>> >>>>>>
>>> >>>>>> or alternatively (to make people using lazy
>>> >>>>>> accumulator types), Option B:
>>> >>>>>>
>>> >>>>>>     sum = foldr (+) 0
>>> >>>>>>
>>> >>>>>> The current state is no good for anyone or anything.
>>> >>>>>>
>>> >>>>>> ---
>>> >>>>>>
>>> >>>>>> Related issue which Hecate didn't clearly mention, is
>>> >>>>>> that Foldable class default implementation has
>>> >>>>>>
>>> >>>>>>    class Foldable f where
>>> >>>>>>        ...
>>> >>>>>>        sum = getSum . foldMap Sum -- this is "good" lazy definition
>>> 
>>> >>>>>>
>>> >>>>>> If we select option A, then I argue that for
>>> >>>>>> consistency the default `Foldable.sum` should be
>>> >>>>>>
>>> >>>>>>        sum = getSum . foldMap' Sum -- strict foldMap'
>>> >>>>>>
>>> >>>>>> If we select option B, Foldable definition doesn't need to be
>>> changed.
>>> >>>>>>
>>> >>>>>> ---
>>> >>>>>>
>>> >>>>>> I repeat, using non-strict left fold, foldl, for sum
>>> >>>>>> and product is not good for anything.
>>> >>>>>> Either foldr or foldl'.
>>> >>>>>>
>>> >>>>>> I have no strong preference. Current state is unacceptable.
>>> >>>>>>
>>> >>>>>> -  Oleg
>>> >>>>>>
>>> >>>>>> On 18.10.2020 22.24, Henning Thielemann wrote:
>>> >>>>>>>
>>> >>>>>>> On Sun, 18 Oct 2020, Hécate wrote:
>>> >>>>>>>
>>> >>>>>>>> In conclusion, leaving things to the optimiser that
>>> >>>>>>>> could be trivially made fast every time seems
>>> >>>>>>>> needlessly risky.
>>> >>>>>>>
>>> >>>>>>> `seq` is still a hack. A strict 'sum' and 'product'
>>> >>>>>>> would still fail on a lazy accumulator type, say a
>>> >>>>>>> lazy pair type. If at all, sum and product should be
>>> >>>>>>> deepseq-strict. So currently, letting the optimiser
>>> >>>>>>> make a lazy sum strict is still the smaller hack.
>>> >>>>>>>
>>> >>>>>>> _______________________________________________
>>> >>>>>>> Libraries mailing list
>>> >>>>>>> Libraries@ haskell. org ( Libraries at haskell.org )
>>> >>>>>>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries
>>> ( http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>> >>>>>>
>>> >>>>>> _______________________________________________
>>> >>>>>> Libraries mailing list
>>> >>>>>> Libraries@ haskell. org ( Libraries at haskell.org )
>>> >>>>>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>> >>>>>
>>> >>>>> _______________________________________________
>>> >>>>> Libraries mailing list
>>> >>>>> Libraries@ haskell. org ( Libraries at haskell.org )
>>> >>>>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>> >>>>
>>> >>>> _______________________________________________
>>> >>>> Libraries mailing list
>>> >>>> Libraries@ haskell. org ( Libraries at haskell.org )
>>> >>>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>> >>> --
>>> >>> Hécate ✨
>>> >>> IRC: Uniaika
>>> >>> WWW: https:/ / glitchbra. in ( https://glitchbra.in )
>>> >>> RUN: BSD
>>> >>>
>>> >>> _______________________________________________
>>> >>> Libraries mailing list
>>> >>> Libraries@ haskell. org ( Libraries at haskell.org )
>>> >>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>> >>
>>> >> _______________________________________________
>>> >> Libraries mailing list
>>> >> Libraries@ haskell. org ( Libraries at haskell.org )
>>> >> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>> > _______________________________________________
>>> > Libraries mailing list
>>> > Libraries@ haskell. org ( Libraries at haskell.org )
>>> > http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>> 
>>> --
>>> Jón Fairbairn Jon. Fairbairn@ cl. cam. ac. uk ( Jon.Fairbairn at cl.cam.ac.uk
>>> )
>>> http:/ / www. chaos. org. uk/ ~jf/ Stuff-I-dont-want. html (
>>> http://www.chaos.org.uk/~jf/Stuff-I-dont-want.html ) (updated 2014-04-05)
>>> 
>>> _______________________________________________
>>> Libraries mailing list
>>> Libraries@ haskell. org ( Libraries at haskell.org )
>>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>>> 
>> 
>> 
>> 
>> 
>> 
>> --
>> David Duke
>> Emeritus Professor of Computer Science
>> School of Computing University of Leeds UK
>> E:duke. j. david@ gmail. com ( E%3Aduke.j.david at gmail.com )
>> W: https:/ / engineering. leeds. ac. uk/ staff/ 334/ Professor_David_Duke (
>> https://engineering.leeds.ac.uk/staff/334/Professor_David_Duke )
>> _______________________________________________
>> Libraries mailing list
>> Libraries@ haskell. org ( Libraries at haskell.org )
>> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
>> 
> 
> 
> 
> 
> _______________________________________________
> Libraries mailing list
> Libraries@ haskell. org ( Libraries at haskell.org )
> http:/ / mail. haskell. org/ cgi-bin/ mailman/ listinfo/ libraries (
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries )
> 
> 
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20201027/e12646cf/attachment.html>


More information about the Libraries mailing list