[Haskell-beginners] Re: folds -- help!

Tue Mar 10 15:48:52 EDT 2009

Kurt Hutchinson <kelanslists <at> gmail.com> writes:
> You're thinking of a slightly different 'empty' here. You're thinking
> of what happens when you reach the end of the list, after folding it
> all, and there are no more elements to fold. In this example, you're
> right that the acc parameter is 6. But what if the list you *first
> gave to foldr* was empty? Then it would evaluate to 0, the initial
> seed value.
> 

Ok.

>> if f can start delivering the result without looking at its second argument,
>> you can start consuming the result before the fold has traversed the whole
>> list.
> >>
> >
> > Ok, that isn't clearly illustrated by the example in the book:
> >
> >> > foldl (+) 0 (1:2:3:[])
> >> >    == foldl (+) (0 + 1)             (2:3:[])
> >> >    == foldl (+) ((0 + 1) + 2)       (3:[])
> >> >    == foldl (+) (((0 + 1) + 2) + 3) []
> >> >    ==           (((0 + 1) + 2) + 3)
> >> >
> >> >
> >> > foldr (+) 0 (1:2:3:[])
> >> >    ==  1 +           foldr (+) 0 (2:3:[])
> >> >    ==  1 + (2 +      foldr (+) 0 (3:[])
> >> >    ==  1 + (2 + (3 + foldr (+) 0 []))
> >> >    ==  1 + (2 + (3 + 0))
> >> >
> >
> > In that example, it doesn't look like anything in foldr can be evaluated
> > until the whole fold has been completed.
> 
> You're right, that example doesn't show how you could start using the
> result without fully evaluating the fold, since addition doesn't give
> partial results. The concat example is better in that regard.
> 

Ok. I guess I did understand something.  Therefore, I think 
the example in the book that uses addition with foldl and foldr is TERRIBLE.  
I think the sections on folds in RWH are a catastrophe and 
need to be rewritten.  The authors need to get rid of the "bit twiddling" 
example and provide an example like concat to clearly show the 
difference between foldl and foldr.  I think the example using foldl 
and foldr with addition should be a secondary example to demonstrate 
that sometimes it doesn't matter whether you use foldl or foldr.

> >> Common examples are things like
> >>
> >> concat = foldr (++) [],
> >> so
> >> concat [l1,l2,l3,l4,l5] = l1 ++ (foldr (++) [] [l2,l3,l4,l5])
> >> and the start (l1) can be used before further reducing the fold,
> >>
> >
> > So does haskell store a thunk for everything to the right of l1?
> > You said that when using foldr you can start "consuming" the beginning of 
> > the before the whole result is reduced.  I don't quite get that.
> 
> A thunk is used as a stand-in for most calculations before the result
> is actually calculated. That way, if you never try to use the result,
> the calculation never needs to be done, and that means less work. As
> an example that ties to the concat example above, say your program
> only wanted to test if the result of the concat fold was an empty
> list. The function 'null' takes a list and evaluates to True or False,
> based on whether the list is empty or not. So:
> 
> someFunc xs = null ( concat xs )
>   where
>   concat ys = foldr (++) [] ys
> 
> The 'null' function only needs to test whether the list that is the
> result of foldConcat has at least one element. Let's say l1 has an
> element. So it's kind of evaluated like this:
>   someFunc [ l1, l2, l3, l4, l5 ]
>   null (concat [ l1, l2, l3, l4, l5 ] )
>   null ( l1 ++ ( thunk with rest of fold ))
>   False
> 
> The rest of the fold doesn't need to be evaluated, since the beginning
> part is enough for 'null' to tell that the result would have at least
> one element (because l1 does). That's one way foldr can be used to
> start consuming the result before the entire fold is done. 
>

Ok.  The terminology is just a little confusing for me. When you say
"start consuming before the fold is done" it sounds to me like
somehow you are reading partial results off the front of the result
before foldr returns the entire result.  But as you explained above
what you actually mean is that foldr will return the entire result--
with part of the result being a thunk, and subsequently you can do 
operations on the result that may never require the thunked part to 
be evaluated.

> It depends
> completely on the accFunc: if it can return partial results, like
> concat, then you can start consuming the result before a full
> evaluation. Some accFunc's can't return partial results, like regular
> addition. In that case, it's probably better to use foldl' (note the
> apostrophe), which will force the thunks to be evaluated as they are
> generated and so use less memory. foldl is the same as foldl' except
> [foldl] does generate thunks, and then evaluates them all at the end of the
> fold, so it uses a bunch of memory to store the thunks in the
> meantime, which usually isn't useful.
> 
> Kurt
> 

Thanks to you and Daniel for the great explanations.  I'm feeling a lot
better about folds now.

....
....
......
....
...
...
....
....
....
....
...
...
...
....