Cale Gibbard cgibbard at gmail.com
Tue Nov 1 18:38:01 EST 2005

```On 01/11/05, Sebastian Sylvan <sebastian.sylvan at gmail.com> wrote:
> > I was wondering... In my experience, it's worked much better to use
> >
> > sum' = foldl' (+) 0
> >
> > than the built-in "sum" function, which leaks memory like crazy for
> > large input lists. I'm guessing the built-in definition is
> >
> > sum = foldr (+) 0
> >
> > But as far as I know, (+) is always strict, so foldl' seems much more
> > natural to me. Is there a case where the build-in definition is
> > preferable?
>
> The library definition in ghc actually uses foldl.
> It's conceivable that you may not want (+) to be non-strict for
> certain data types.
> The question then becomes, is there a case where you want _sum_ to be
> non-strict?
>

I'd argue that the likely answer is no, given that (+) is an operation
in an Abelian group and not a monoid. Regardless of whether (+) is
strict, when you compute a sum you will always have to consume the
entire list. This is because until you have observed the last element
of the list, nothing can be said about the final result. To see this,
it's enough to see that even if the sum of all the other elements of
the list is g, the final element could be -g + h, which would give a
final result of h, regardless of what g is.

On the other hand, you don't always want product to be strict, since
(*) is a monoid operation, and so you actually can have information
about the final result long before the list is completely consumed. As
a simple example, consider computing the product of [0..10000000].
Unfortunately, giving a general definition to product seems to
preclude the possibility of allowing for this kind of optimisation in
general, since in general, ring elements aren't comparable for
equality (though Eq is a superclass of Num, you can't always define
(==) sensibly) and even if you can test for equality, without knowing