[Haskell-cafe] To seq or not to seq, that is the question
Edward Z. Yang
ezyang at MIT.EDU
Sat Mar 9 05:53:15 CET 2013
Are these equivalent? If not, under what circumstances are they not
equivalent? When should you use each?
evaluate a >> return b
a `seq` return b
return (a `seq` b)
Furthermore, consider:
- Does the answer change when a = b? In such a case, is 'return $! b' permissible?
- What about when b = () (i.e. unit)?
- What about when 'return b' is some arbitrary monadic value?
- Does the underlying monad (e.g. if it is IO) make a difference?
- What if you use pseq instead of seq?
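A minimal sketch of how the forms above can be told apart in IO, assuming 'a' is a bottom built with 'error' and 'b' is an ordinary value. The names form1..form4 and throws are made up for illustration and are not part of any library:

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)

-- Hypothetical stand-ins for the 'a' and 'b' of the question.
a :: Int
a = error "a was forced"

b :: Int
b = 42

form1, form2, form3, form4 :: IO Int
form1 = evaluate a >> return b   -- forces 'a' as an IO step
form2 = a `seq` return b         -- forces 'a' when the action itself is forced
form3 = return (a `seq` b)       -- forces 'a' only when the result is forced
form4 = return $! a              -- ($!) unfolds this to: a `seq` return a

-- True if running the action raised an ErrorCall, False if it completed.
-- The action's result is never inspected here.
throws :: IO x -> IO Bool
throws act = do
  r <- try act
  return $ case r of
    Left (ErrorCall _) -> True
    Right _            -> False
```

Loaded into GHCi, 'throws form1', 'throws form2' and 'throws form4' should each report True, while 'throws form3' reports False and 'throws (form3 >>= evaluate)' reports True, since only then is the returned thunk forced. The sketch deliberately does not test what happens when form2 is forced without being run; that may depend on optimization settings.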
In http://hackage.haskell.org/trac/ghc/ticket/5129 we found a bug in
'evaluate' deriving precisely from this confusion. Unfortunately, the
insights from this conversation were never distilled into a widely
publicized set of guidelines... largely because we never really figured
out what was going on! The purpose of this thread is to figure out what is
really going on here, and develop a concrete set of guidelines which we
can disseminate widely. Here is one strawman answer (which is too
complicated to use in practice):
- Use 'evaluate' when you mean to say, "Evaluate this thunk to WHNF
before doing any other IO actions, please." Use it as much as
possible in IO.
- Use 'return (a `seq` b)' for strictness concerns that have no
relation to the monad. It avoids unnecessary strictness when the
value ends up never being used and is good hygiene if the space
leak only occurs when 'b' is evaluated but not 'a'.
- Use 'return $! a' when you mean to say, "Eventually evaluate this
thunk to WHNF, but if you have other thunks which you need to
evaluate to WHNF, it's OK to do those first." In particular,
    (return $! a) >> (return $! b)
        === a `seq` (return $! b)
        === a `seq` b `seq` return b
        === b `seq` a `seq` return b [1]
This situation is similar for 'a `seq` return ()' and 'a `seq` m'.
Avoid using this form in IO; empirically, you're far more likely
to run into stupid interactions with the optimizer, and when later
monadic values may be bottoms, the optimizer will be justified in
its choice. Prefer using this form when you don't care about
ordering, or if you don't mind thunks not getting evaluated when
bottoms show up. For non-IO monads, since everything is imprecise
anyway, it doesn't matter.
- Use 'pseq' only when 'par' is involved.
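To make the ordering point in the strawman concrete, here is a small sketch assuming both 'a' and 'b' are bottoms carrying different error messages. The names ordered, unordered and caught are hypothetical helpers introduced only for this illustration:

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)

-- Hypothetical bottoms standing in for the post's 'a' and 'b'.
a, b :: Int
a = error "a"
b = error "b"

-- 'evaluate' ties each forcing step to its position in the IO sequence,
-- so 'a' is forced (and throws) before 'b' is ever looked at.
ordered :: IO Int
ordered = evaluate a >> evaluate b

-- Built from 'seq' only: per the equations in the post, 'a' and 'b' may
-- be forced in either order, so either exception may surface.
unordered :: IO Int
unordered = (return $! a) >> (return $! b)

-- Message of the ErrorCall raised by running the action, if any.
caught :: IO x -> IO (Maybe String)
caught act = do
  r <- try act
  return $ case r of
    Left (ErrorCall msg) -> Just msg
    Right _              -> Nothing
```

Running 'caught ordered' should yield the message from 'a'. For 'caught unordered' the only thing worth relying on is that some exception is reported; which of the two it is falls under imprecise exception semantics and may vary with compiler version and optimization level.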
Edward