[Haskell-cafe] Two-iteration optimisation (was: GHC Predictability)
Paul Johnson
paul at cogito.org.uk
Wed May 14 13:08:28 EDT 2008
We've had a big debate over
> mean xs = foldl' (+) 0 xs / fromIntegral (length xs)
For anyone who didn't follow it, the problem is that "mean" needs to
traverse its argument twice, so the entire list has to be held in
memory. So if "xs = [1..1000000000]" then "mean xs" uses all your
memory, but "foldl' (+) 0 xs" and "length xs" by themselves do not.
The solution is for the programmer to rewrite "mean" to accumulate a
pair containing the running total and count together, then do the
division. This makes me wonder: could there be a compiler optimisation
rule for this, collapsing two iterations over a list into one. I've
never tried writing GHC rules, but something like:
> f (foldl' g x) (foldl' h x) = (uncurry f) (foldl' (\i (gt,ht) -> (g
i gt, h i ht)))
The first problem that occurs to me is the number of permutations of
fold and map functions.
Paul.
More information about the Haskell-Cafe
mailing list