[Haskell-cafe] Question on optimization

Viktor Dukhovni ietf-dane at dukhovni.org
Tue Dec 15 23:13:43 UTC 2020


On Tue, Dec 15, 2020 at 02:05:13PM -0800, Todd Wilson wrote:

> In code of the form
>     case as ++ bs of
>       [] -> ...
>       x:xs -> ...
>
> under what circumstances is the overhead of cons-cell creation for the
> elements of `as` that would happen in an eager language avoided?

In Haskell, even without optimisation, I'd expect the above to
run in constant space and time regardless of the length of `as`.

> Always? Is the answer the same if `f` is defined by simple list
> recursion and we call `f (as ++ bs)`? What if `f` is defined in a
> different module (or comes from the Prelude)?

For example, ghci with no optimisation:

    λ> let f n = [n..n*n]
    λ> case f 1000000 ++ f 10 of { [] -> "empty"; x:xs -> show x }
    "1000000"
    (0.01 secs, 69,680 bytes)
    λ> case f 10000000000000 ++ f 10 of { [] -> "empty"; x:xs -> show x }
    "10000000000000"
    (0.01 secs, 75,384 bytes)

    λ> let x = (1 : undefined) ++ [2]
    (0.01 secs, 63,200 bytes)
    λ> case x of { [] -> "empty"; x:xs -> show x }
    "1"
    (0.01 secs, 64,808 bytes)

So at least with "++" the result is expected to be lazy in the tail of
`as`.  If you interpose some list-generating function:

    f :: [a] -> [b]

called as `f (as ++ bs)`, the result rather depends on how strict `f` is
in the tail of its input.  If `f` can return an initial segment of its
output from just an initial segment of its input, then the same sort
of constant space/time behaviour would be expected from `f`.

> by looking at the (intermediate) code generated by the compiler, but
> more generally, are there some succinct principles that everyday
> programmers can hang on their cubicle walls that summarize a lot of
> what they can expect from the compiler in terms of optimizations?

This feels more like strict/lazy question to me, than optimised vs.
unoptimised.  Haskell promises lazy evaluation even when unoptimised.

    λ> head $ map (+1) $ (1 : undefined) ++ [2]
    2

Here, `map` is but one example of an `f` that is lazy in the tail of its
input:

    map _ [] = []
    map f (x:xs) = f x : map f xs

Generally, any list-to-list function that returns incrementally
consumable output given an infinite list, or a usable initial segment
from a list with a divergent term is one that necessarily only
evaluates some finite initial segment of its input, and is not
sensitive to what lies beyond.

A naive mental model for (++) is then:

    [] ++ bs = bs
    (a:as) ++ bs = a : (as ++ bs)

which is easily seen to be lazy in `as`.

-- 
    Viktor.


More information about the Haskell-Cafe mailing list