[Proposal] Strict `sum` and `product`

Fumiaki Kinoshita fumiexcel at gmail.com
Thu Oct 22 11:05:02 UTC 2020


>
> For a forced example, consider a wrapping of Double with a test
> for infinity on the left argument
>
>>  a + b | isInfinite a = a
>>

This makes sense only if sum is foldr (+) 0, while the implementation of
sum for lists is foldl (+) 0. But hey, we are comparing a hypothetical type
with hundreds of real world uses.

Haskell should remain a lazy language
>
I'm not sure if I agree. We've been bitten by laziness too much, hence the
introduction of BangPatterns, StrictData, deepseq, etc. IMO Haskell
ecosystem should be more strict.

I'm uncomfortable with making foldMap/sum strict.
>

No one is suggesting to change foldMap; what OP wants to change is sum and
product, and foldMap _happens to be_ the current default implementation.

> we dont knowing the context. the best way to avoid the cost of
> computation is not to compute it; what alternatives are there
> i this s[ecific problem.
>

The context is pretty clear; the status quo (sum = foldl (+) 0) is flat-out
wrong. I'd say more than 80% of the uses of sum is supposed to look at all
the elements. And even if (+) is short-circuiting in the way Jon suggested,
lazy foldl does not take any advantage of it.

if one was to make foldMap/sum strict why not also other functions, e.g.
> scanAccum, where does one stop on the grounds of possibly saving costs in
> so as yet
> unknown application
>
I'm not sure which function you are talking about (there's no function
called scanAccum in base nor containers). If you mean scanl, lazy
accumulator still makes sense because the resulting list can be consumed
incrementally. foldl is not.

Instead of thinking/programming  in a pure lazy language, you are then in
> the position of thinking lazily apart from some set of
> to be determined exceptional cases. the simplicity and uniformity of the
> model is gone.
>
Again, there's little to be gained from lazy foldl. It's avoided in general
and I'd say it's not "uniform" to use the function.

4. As per item 3 but from a pedagogical point of view. I taught UG  FP for
> years. Complicating the model is not going to help students,
> to say nothing of the caveats that would have to be added to
> existing teaching materials.
>
Also from a pedagogical point of view, a number of beginners fall into a
trap of lazy foldl. Caveats should be added to the current one if at all,
but this danger zone can be removed with one byte change.

I'm seriously worried that the use of foldl in the standard library might
make beginners think that it's fine to use foldl.

As far as I remember, the same change has been suggested a while ago but
didn't make it. I hope we don't get discouraged this time and go ahead with
the change.

2020年10月22日(木) 19:19 David Duke <duke.j.david at gmail.com>:

> Concur with Jon's point.
> I'm uncomfortable with making foldMap/sum strict.
> Haskell was intended to explore a region of the programming design space
> that had not been well covered. and it remains
> if not unique  then at least exemplary in that respect. When  I write
> in Haskell I do so with the knowledge that I am working with
> lazy evaluation with the benefits and risk that comes with that. If I'm
> desperate to have finer operational control, I  either
> swallow that pill and adapt my use of Haskell accordingly, or pick up a
> different tool from the programming language toolbox.
>
> Now the OP pointed to an issue with one specific use of sum. However, that
> seems like weak grounds to make library changes:
> 1. we dont knowing the context. the best way to avoid the cost of
> computation is not to compute it; what alternatives are there
> i this s[ecific problem.
> 2. if one was to make foldMap/sum strict why not also other functions,
> e.g. scanAccum, where does one stop on the grounds of possibly saving costs
> in so as yet
> unknown application
> 3. Instead of thinking/programming  in a pure lazy language, you are then
> in the position of thinking lazily apart from some set of
> to be determined exceptional cases. the simplicity and uniformity of the
> model is gone.
> 4. As per item 3 but from a pedagogical point of view. I taught UG  FP for
> years. Complicating the model is not going to help students,
> to say nothing of the caveats that would have to be added to
> existing teaching materials.
> So whilst  the post was clearly intended to trigger discussion, in terms
> on conveying sentiment a strict  foldMap would be a big '-'
> for me personally. Whilst there are performance challenges in working
>  with Haskell. The primary role of the programming notation
> is allowing one to think about the problem and obtain a clear, correct
> solution. Performance tuning comes later if at all, and
> her are known techniques for doing so  with Haskell/ghc.
> 5. OTOH I see little   harm in adding a library that provides a strict
> foldMap for those
> who want to use it after  deliberate thought.
>
> YMMMocV
> David
>
>
>
>
>
>
>
> On Thu, Oct 22, 2020 at 10:36 AM Jon Fairbairn <jon.fairbairn at cl.cam.ac.uk>
> wrote:
>
>> Sylvain Henry <sylvain at haskus.fr> writes:
>>
>> > Do we have an actual example of the use of a lazy sum in the
>> > wild?
>>
>> That’s not a very good argument, though I’ve seen it used quite
>> a lot.  The problem is that just because no-one can currently
>> think of a “real” example it doesn’t mean that there are no
>> cases where laziness is required.
>>
>> For a forced example, consider a wrapping of Double with a test
>> for infinity on the left argument
>>
>>>>  a + b | isInfinite a = a
>>>>
>> which doesn’t evaluate it’s second argument if the first is
>> infinite.
>>
>> > If the most common case is the strict one, making `sum`
>> > strict would be a better default.
>>
>> Haskell should remain a lazy language, so the defaults should be
>> lazy unless they can be proved strict anyway.
>>
>> > If need be we could also provide `lazySum` or something, but
>> > is there really a need?
>>
>> Who can truly say? I’d support sum' as the strict version as
>> that is consistent with other usages.
>>
>>   — Jón
>>
>> >
>> > Sylvain
>> >
>> >
>> > On 18/10/2020 22:16, Oleg Grenrus wrote:
>> >>
>> >> Sorry I wasn't clear myself. Option C is "add sum'"
>> >>
>> >> For lists:
>> >>
>> >>     sum  = foldr  (+) 0
>> >>     sum' = foldl' (+) 0
>> >>
>> >> For Foldable
>> >>
>> >>     sum  = getSum #. foldMap  Sum
>> >>     sum' = getSum #. foldMap' Sum
>> >>
>> >> - Oleg
>> >>
>> >> On 18.10.2020 23.11, Hécate wrote:
>> >>>
>> >>> Indeed, and I initially went to suggest a
>> >>> foldMap'`-based implementation to keep with the current
>> >>> implementation of many Foldable functions that are based
>> >>> on `foldMap` rather than a raw `fold`.
>> >>>
>> >>> On 18/10/2020 22:04, Oleg Grenrus wrote:
>> >>>> For the sake of bigger audience I didn't bother
>> >>>> mentioning #. which is a coercion helper. It's
>> >>>> essentially better (.) when the first argument is
>> >>>> newtype constructor (i.e. coerce).
>> >>>>
>> >>>> So with Option A (strict):
>> >>>>
>> >>>>     sum = getSum #. foldMap' Sum
>> >>>>
>> >>>> Or Option B (lazy)
>> >>>>
>> >>>>     sum = getSum #. foldMap Sum
>> >>>>
>> >>>> ---
>> >>>>
>> >>>> There is also third option, Option C:
>> >>>>
>> >>>>     sum  = foldr  (+) 0
>> >>>>     sum' = foldl' (+) 0
>> >>>>
>> >>>> I don't think this is worthwhile, but it is an option.
>> >>>> (to rehash, I don't consider maintaining status quo to
>> >>>> be an option at all).
>> >>>>
>> >>>> - Oleg
>> >>>> On 18.10.2020 22.54, Vanessa McHale wrote:
>> >>>>>
>> >>>>> It's
>> >>>>>
>> >>>>> sum = getSum #. foldMap Sum
>> >>>>>
>> >>>>> in base.
>> >>>>>
>> >>>>> On 10/18/20 2:49 PM, Oleg Grenrus wrote:
>> >>>>>>
>> >>>>>> The problem is the current definition of sum for lists
>> >>>>>> which uses foldl, i.e non-strict left fold
>> >>>>>>
>> >>>>>>     sum = foldl (+) 0
>> >>>>>>
>> >>>>>> It's utterly broken. Either we should change it to
>> >>>>>> foldl' to work on some types where addition is strict,
>> >>>>>> Option A:
>> >>>>>>
>> >>>>>>     sum = foldl' (+) 0
>> >>>>>>
>> >>>>>> or alternatively (to make people using lazy
>> >>>>>> accumulator types), Option B:
>> >>>>>>
>> >>>>>>     sum = foldr (+) 0
>> >>>>>>
>> >>>>>> The current state is no good for anyone or anything.
>> >>>>>>
>> >>>>>> ---
>> >>>>>>
>> >>>>>> Related issue which Hecate didn't clearly mention, is
>> >>>>>> that Foldable class default implementation has
>> >>>>>>
>> >>>>>>    class Foldable f where
>> >>>>>>        ...
>> >>>>>>        sum = getSum . foldMap Sum -- this is "good" lazy definition
>> >>>>>>
>> >>>>>> If we select option A, then I argue that for
>> >>>>>> consistency the default `Foldable.sum` should be
>> >>>>>>
>> >>>>>>        sum = getSum . foldMap' Sum -- strict foldMap'
>> >>>>>>
>> >>>>>> If we select option B, Foldable definition doesn't need to be
>> changed.
>> >>>>>>
>> >>>>>> ---
>> >>>>>>
>> >>>>>> I repeat, using non-strict left fold, foldl, for sum
>> >>>>>> and product is not good for anything.
>> >>>>>> Either foldr or foldl'.
>> >>>>>>
>> >>>>>> I have no strong preference. Current state is unacceptable.
>> >>>>>>
>> >>>>>> -  Oleg
>> >>>>>>
>> >>>>>> On 18.10.2020 22.24, Henning Thielemann wrote:
>> >>>>>>>
>> >>>>>>> On Sun, 18 Oct 2020, Hécate wrote:
>> >>>>>>>
>> >>>>>>>> In conclusion, leaving things to the optimiser that
>> >>>>>>>> could be trivially made fast every time seems
>> >>>>>>>> needlessly risky.
>> >>>>>>>
>> >>>>>>> `seq` is still a hack. A strict 'sum' and 'product'
>> >>>>>>> would still fail on a lazy accumulator type, say a
>> >>>>>>> lazy pair type. If at all, sum and product should be
>> >>>>>>> deepseq-strict. So currently, letting the optimiser
>> >>>>>>> make a lazy sum strict is still the smaller hack.
>> >>>>>>>
>> >>>>>>> _______________________________________________
>> >>>>>>> Libraries mailing list
>> >>>>>>> Libraries at haskell.org
>> >>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>> >>>>>>
>> >>>>>> _______________________________________________
>> >>>>>> Libraries mailing list
>> >>>>>> Libraries at haskell.org
>> >>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>> >>>>>
>> >>>>> _______________________________________________
>> >>>>> Libraries mailing list
>> >>>>> Libraries at haskell.org
>> >>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>> >>>>
>> >>>> _______________________________________________
>> >>>> Libraries mailing list
>> >>>> Libraries at haskell.org
>> >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>> >>> --
>> >>> Hécate ✨
>> >>> IRC: Uniaika
>> >>> WWW:https://glitchbra.in
>> >>> RUN: BSD
>> >>>
>> >>> _______________________________________________
>> >>> Libraries mailing list
>> >>> Libraries at haskell.org
>> >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>> >>
>> >> _______________________________________________
>> >> Libraries mailing list
>> >> Libraries at haskell.org
>> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>> > _______________________________________________
>> > Libraries mailing list
>> > Libraries at haskell.org
>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>> --
>> Jón Fairbairn                                 Jon.Fairbairn at cl.cam.ac.uk
>> http://www.chaos.org.uk/~jf/Stuff-I-dont-want.html  (updated 2014-04-05)
>>
>> _______________________________________________
>> Libraries mailing list
>> Libraries at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>
>
> --
> David Duke
> Emeritus Professor of Computer Science
> School of Computing University of Leeds UK
> E:duke.j.david at gmail.com
> W:https://engineering.leeds.ac.uk/staff/334/Professor_David_Duke
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20201022/1ccc51a6/attachment.html>


More information about the Libraries mailing list