until, laziness, and performance

Sun Mar 27 22:44:45 UTC 2016

On Sun, Mar 27, 2016 at 4:42 PM, George Colpitts
<george.colpitts at gmail.com> wrote:
> I still don't understand why
>
> as' i s
>      | i >= n    = s
>      | otherwise = as' (i + 2) (s + (-1) / i + 1 / (i + 1))
>
>
> is not lazy in its second arg as it only has to evaluate it if i >= n. This
> seems exactly analogous to or, ||, which only needs to evaluate its second
> arg if its first arg is False. || is definitely lazy in its second argument.
> Can you explain why as' is strict in its second arg, s? Is it due to the
> following from,
> https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/Demand ?

The simplest way to see why as' is strict in s is to use the actual
definition of strictness, and not think about operational semantics.
The definition is that a function f is strict if `f undefined =
undefined`. So, what is `as' i undefined`?

If `i >= n` then `as' i undefined = undefined`
if not `i >= n` then `as' i undefined = as' (i+2) undefined`

The second case holds because Double arithmetic is strict, so the
complex expression will be undefined because s is.

>From these cases we can reason that the result will always be
undefined. Either the value of i will grow such that the first case
succeeds, or if that will never happen (if n is NaN for instance), the
loop will continue forever, in which case the result is equivalent to
undefined semantically.

Because of this denotational analysis, we are at liberty to make the
operation of as' such that it always evaluates s immediately, because
it won't change the answer. And that is the optimization that saves
time/space.

By contrast, my alternate function:

    as' i s
      | i >= n = (i, s)
      | otherwise = as' (i+2) (...)

is easily not strict in s, because:

    as' i undefined = (i, undefined)

when `i >= n` is True, and (i, undefined) is not undefined. So it is
invalid to perform the same optimization, because it would change the
answer on some inputs. We happen to know that all the places the
function is used, those two answers would be equivalent (because we
just project out the second item), but that's somewhat difficult,
non-local reasoning.

Note that GHC does do some optimization. The loop in until is (in this
case) a loop on a pair, and it figures out that the loop is strict in
the pair, and eliminates the construction of the pair (by passing two
separate arguments, and returning an unboxed pair from the loop).
Optimizing the pieces of the pair would have to be a sort of
speculative operation (without doing a global analysis), where 2^k
different versions of the loop are generated (where k is the number of
non-strict pieces), and the call sites are optimized to call the best
of those versions for them. But 2^k isn't a good looking number for
this sort of thing.

-- Dan