[Proposal] Strict `sum` and `product`

Wander Hillen wjw.hillen at gmail.com
Thu Oct 22 10:38:32 UTC 2020


I don't think this example shows that a lazy fold would be better. Even if
the combining argument of the fold is not strict in the second argument,
evaluating something like "sum [Infinity, 1, 2, 3]" with the
infinity-shorting double will still touch every element in the list. As I
understand it, to actually benefit from laziness in these functions they'd
need both a lazy combining function AND a lazy accumulator type (which
Double is not). For a practical example of where a shortcut might be
possible, consider a case like "if (product [1,2,3,0,4,5]) < 20 then f x
else g y". Evaluation of the list could theoretically stop after it
encounters the zero [0] since after that any further elements in the list
will no longer change the value of the outcome. I don't think that this
actually occurs however and doubt the various fold functions could be
rewritten in such a way as to handle this in the general case as long as
the accumulator type is strict. As was mentioned earlier, all of the Num
types in base are strict and both `sum` and `product` have a Num
constraint. If lazy shortcutting is important, a small custom function that
recurses over the list and tests for problem-specific edge cases will still
be possible.

I disagree btw with the statement that we should not change this even
though nobody can think of a compelling reason not to because we might
think of such a reason later. That standpoint in its extreme can only lead
to paralysis, since it is not always possible to prove no reasons will be
thought of later.

Davids mail came in as I was writing this, I do agree with many of his
points regarding laziness but would like to point out that as Haskell
was/is intended to explore laziness in programming, we should eventually be
able to get the benefits from these explorations as well. Negative results
can be as valuable as positive results after all, so if
"we-the-library-writers" discover a place where laziness is not ideal it
should be possible (encouraged even) to apply this learning and change our
language for the better. In this specific case (sum and product only), the
use of strict accumulator types results in strict behavior already and
therefore we lose nothing by changing from lazy to strict foldl. The
proposal is NOT about removing lazy folds in general, just these two
particular cases where they do not bring any benefits but do bring space
leaks.

Wander

[0]: This is an optimization we see in the implementation of Integer
multiplication
<https://hackage.haskell.org/package/integer-gmp-1.0.3.0/docs/src/GHC.Integer.Type.html#timesInteger>.
It does not work for `product` though, only for `(*)`.

Op do 22 okt. 2020 om 11:36 schreef Jon Fairbairn <
jon.fairbairn at cl.cam.ac.uk>:

> Sylvain Henry <sylvain at haskus.fr> writes:
>
> > Do we have an actual example of the use of a lazy sum in the
> > wild?
>
> That’s not a very good argument, though I’ve seen it used quite
> a lot.  The problem is that just because no-one can currently
> think of a “real” example it doesn’t mean that there are no
> cases where laziness is required.
>
> For a forced example, consider a wrapping of Double with a test
> for infinity on the left argument
>
>>  a + b | isInfinite a = a
>>
> which doesn’t evaluate it’s second argument if the first is
> infinite.
>
> > If the most common case is the strict one, making `sum`
> > strict would be a better default.
>
> Haskell should remain a lazy language, so the defaults should be
> lazy unless they can be proved strict anyway.
>
> > If need be we could also provide `lazySum` or something, but
> > is there really a need?
>
> Who can truly say? I’d support sum' as the strict version as
> that is consistent with other usages.
>
>   — Jón
>
> >
> > Sylvain
> >
> >
> > On 18/10/2020 22:16, Oleg Grenrus wrote:
> >>
> >> Sorry I wasn't clear myself. Option C is "add sum'"
> >>
> >> For lists:
> >>
> >>     sum  = foldr  (+) 0
> >>     sum' = foldl' (+) 0
> >>
> >> For Foldable
> >>
> >>     sum  = getSum #. foldMap  Sum
> >>     sum' = getSum #. foldMap' Sum
> >>
> >> - Oleg
> >>
> >> On 18.10.2020 23.11, Hécate wrote:
> >>>
> >>> Indeed, and I initially went to suggest a
> >>> foldMap'`-based implementation to keep with the current
> >>> implementation of many Foldable functions that are based
> >>> on `foldMap` rather than a raw `fold`.
> >>>
> >>> On 18/10/2020 22:04, Oleg Grenrus wrote:
> >>>> For the sake of bigger audience I didn't bother
> >>>> mentioning #. which is a coercion helper. It's
> >>>> essentially better (.) when the first argument is
> >>>> newtype constructor (i.e. coerce).
> >>>>
> >>>> So with Option A (strict):
> >>>>
> >>>>     sum = getSum #. foldMap' Sum
> >>>>
> >>>> Or Option B (lazy)
> >>>>
> >>>>     sum = getSum #. foldMap Sum
> >>>>
> >>>> ---
> >>>>
> >>>> There is also third option, Option C:
> >>>>
> >>>>     sum  = foldr  (+) 0
> >>>>     sum' = foldl' (+) 0
> >>>>
> >>>> I don't think this is worthwhile, but it is an option.
> >>>> (to rehash, I don't consider maintaining status quo to
> >>>> be an option at all).
> >>>>
> >>>> - Oleg
> >>>> On 18.10.2020 22.54, Vanessa McHale wrote:
> >>>>>
> >>>>> It's
> >>>>>
> >>>>> sum = getSum #. foldMap Sum
> >>>>>
> >>>>> in base.
> >>>>>
> >>>>> On 10/18/20 2:49 PM, Oleg Grenrus wrote:
> >>>>>>
> >>>>>> The problem is the current definition of sum for lists
> >>>>>> which uses foldl, i.e non-strict left fold
> >>>>>>
> >>>>>>     sum = foldl (+) 0
> >>>>>>
> >>>>>> It's utterly broken. Either we should change it to
> >>>>>> foldl' to work on some types where addition is strict,
> >>>>>> Option A:
> >>>>>>
> >>>>>>     sum = foldl' (+) 0
> >>>>>>
> >>>>>> or alternatively (to make people using lazy
> >>>>>> accumulator types), Option B:
> >>>>>>
> >>>>>>     sum = foldr (+) 0
> >>>>>>
> >>>>>> The current state is no good for anyone or anything.
> >>>>>>
> >>>>>> ---
> >>>>>>
> >>>>>> Related issue which Hecate didn't clearly mention, is
> >>>>>> that Foldable class default implementation has
> >>>>>>
> >>>>>>    class Foldable f where
> >>>>>>        ...
> >>>>>>        sum = getSum . foldMap Sum -- this is "good" lazy definition
> >>>>>>
> >>>>>> If we select option A, then I argue that for
> >>>>>> consistency the default `Foldable.sum` should be
> >>>>>>
> >>>>>>        sum = getSum . foldMap' Sum -- strict foldMap'
> >>>>>>
> >>>>>> If we select option B, Foldable definition doesn't need to be
> changed.
> >>>>>>
> >>>>>> ---
> >>>>>>
> >>>>>> I repeat, using non-strict left fold, foldl, for sum
> >>>>>> and product is not good for anything.
> >>>>>> Either foldr or foldl'.
> >>>>>>
> >>>>>> I have no strong preference. Current state is unacceptable.
> >>>>>>
> >>>>>> -  Oleg
> >>>>>>
> >>>>>> On 18.10.2020 22.24, Henning Thielemann wrote:
> >>>>>>>
> >>>>>>> On Sun, 18 Oct 2020, Hécate wrote:
> >>>>>>>
> >>>>>>>> In conclusion, leaving things to the optimiser that
> >>>>>>>> could be trivially made fast every time seems
> >>>>>>>> needlessly risky.
> >>>>>>>
> >>>>>>> `seq` is still a hack. A strict 'sum' and 'product'
> >>>>>>> would still fail on a lazy accumulator type, say a
> >>>>>>> lazy pair type. If at all, sum and product should be
> >>>>>>> deepseq-strict. So currently, letting the optimiser
> >>>>>>> make a lazy sum strict is still the smaller hack.
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> Libraries mailing list
> >>>>>>> Libraries at haskell.org
> >>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> Libraries mailing list
> >>>>>> Libraries at haskell.org
> >>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
> >>>>>
> >>>>> _______________________________________________
> >>>>> Libraries mailing list
> >>>>> Libraries at haskell.org
> >>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
> >>>>
> >>>> _______________________________________________
> >>>> Libraries mailing list
> >>>> Libraries at haskell.org
> >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
> >>> --
> >>> Hécate ✨
> >>> IRC: Uniaika
> >>> WWW:https://glitchbra.in
> >>> RUN: BSD
> >>>
> >>> _______________________________________________
> >>> Libraries mailing list
> >>> Libraries at haskell.org
> >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
> >>
> >> _______________________________________________
> >> Libraries mailing list
> >> Libraries at haskell.org
> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
> > _______________________________________________
> > Libraries mailing list
> > Libraries at haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
> --
> Jón Fairbairn                                 Jon.Fairbairn at cl.cam.ac.uk
> http://www.chaos.org.uk/~jf/Stuff-I-dont-want.html  (updated 2014-04-05)
>
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20201022/b7ea9b22/attachment.html>


More information about the Libraries mailing list