[Haskell-cafe] Re: Don't “accidentallyparallelize”

Sun Sep 6 03:08:14 EDT 2009

On Sunday 06 September 2009 2:18:31 am David Menendez wrote:
> On Sat, Sep 5, 2009 at 7:57 PM, Dan Doel<dan.doel at gmail.com> wrote:
> > I suppose technically, what foldl' has over foldl is that it is more
> > readily subject to optimization. Each recursive call is artificially made
> > strict in the accumulator, so it is legal for GHC to optimize the
> > function by keeping the accumulator evaluated, instead of delaying it.
> > When GHC is run with optimizations on, it does analysis on all code that
> > tries to determine such things, and seq can be seen as making such
> > analysis easier for the compiler.
> 
> It turns out, pseq limits the effectiveness of strictness analysis,
> because it forces the order of evaluation. John Meacham described this
> pretty well last week in the Haskell' list
> <http://www.haskell.org/pipermail/haskell-prime/2009-August/003006.html>.

Interesting. I hadn't thought of this before, but it certainly makes sense.

> > This is, of course, not what really happens in GHC. What really happens
> > is that the first argument to seq is evaluated before the second (which
> > is why it even has the intended effect when optimizations aren't on). But
> > that doesn't have to be the case, strictly speaking.
> 
> It's entirely possible for optimized code to end up evaluating the
> second argument to seq before the first.

Yeah, I suppose I should have been more precise. In the absence of 
optimizations, I assume seq translates to core something like:

  seq x y = case x of { z -> y }

where the core version of case evaluates things regardless of patterns and 
such. That explains why foldl' works even if you don't compile it with 
optimizations. But yes, it wouldn't surprise me if the optimizer rearranged 
the above, since the case statement is a no-op other than forcing x, and it 
might see ways to better evaluate things without altering the results.

By contrast, pseq would be like the above, but the optimizer would be unable 
to rewrite it.

Perhaps that's still misleading, though. The difference between them is rather 
like the difference between:

  xor False False = False
  xor False True  = True
  xor True  False = True
  xor True  True  = False

(verbose definition meant to emphasize the symmetry of the arguments) and

  or True  _ = True
  or False b = b

xor is strict in both its arguments, whereas or is strict in its first 
argument, and only strict in its second argument if its first argument is 
False. Similarly, seq is meant to be strict in both its arguments, whereas 
pseq needs to be strict in its first argument, and only strict in its second 
argument if its first argument is non-bottom.

-- Dan