[Haskell-cafe] What unsafeInterleaveIO is unsafe

Sun Mar 15 19:16:26 EDT 2009

>> main = do
>>     r <- newIORef 0
>>     v <- unsafeInterleaveIO $ do
>>         writeIORef r 1
>>         return 1
>>     x <- case f v of
>>         0 -> return 0
>>         n -> return (n - 1)
>>     y <- readIORef r
>>     print y
>>
>> -- a couple of examples:
>> f x = 0 -- program prints "0"
>> -- f x = x -- program prints "1"
> 
> "f" is pure.  But if f is nonstrict, this program prints 0, and if
> it's strict, it prints 1.  The strictness of a pure function can
> change the observable behavior of your program!

Strictness is an effect, even if it isn't recorded in Haskell's types.
Replacing 'v <- ..' by 'let v = undefined' provides similar choices.

'unsafeInterleaveIO' explicitly uses the "implicit" effects of
strictness to drive the "explicit" effects of IO, which is why it
is unsafe (moving IO effects from explicit to implicit). But even 
if 'unsafeInterleaveIO' where eliminated, strictness/evaluation 
would still remain as an implicit effect in Haskell. Here's a 
variation without 'unsafeInterleaveIO':

import Data.IORef
import Control.Exception

main = do
    r <- newIORef 0
    let v = undefined
    handle (\(ErrorCall _)->print "hi">>return 42) $ case f v of
      0 -> return 0
      n -> return (n - 1)
    y <- readIORef r
    print y

-- a couple of examples:
f x = 0 -- program prints '0'
f x = x -- program prints '"hi"0'

(sideline: I often interpret '_|_::t' not so much as an element of 't'
but as failure to produce information at type 't'; the type tells us
what information we can have, evaluation not only provides the
details, but also decides whether or not we have any of that info,
and how much of it)

> Furthermore, due to the monad laws, if f is total, then reordering the
> (x <- ...) and (y <- ...) parts of the program should have no effect.

    case f v of { 0 -> return 0; n -> return (n - 1) }
=
    return $! case f v of { 0 -> 0; n -> (n - 1) }
=/=
    return $ case f v of { 0 -> 0; n -> (n - 1) }

Monad laws apply to the latter, so how is reordering justified?
It doesn't matter if 'f' is total, the context requires to know whether
'f v' is '0' or not. Since the only thing used from the result is its
successful evaluation, the case could be eliminated, but that still
leaves

    return $! f v
=/=
    return $ f v

> But if you switch them, the program will *always* print 0.

In my copy of the docs, the only things known about 'unsafeInterleaveIO'
are its module, its type, and that is is "unsafe". If we assume, from the name, 
that evaluation of its parameter will be interleaved unsafely with the main IO 
thread, there is no knowing when or if that evaluation will happen. Unless 
there is a dependency on the result of that evaluation, in which case we have
an upper bound on when the evaluation must happen, but still no lower
bound. From the comments in the source code, it appears that lower
and upper bound are intended to be identical, ie. evaluation is supposed
to happen at the latest possible point permitted by dependencies.

Changing dependencies changes the program.

> Also, the compiller might notice that x is never used, and that "f" is
> total.  So it could just optimize out the evaluation of "f v"
> completely, at which point the program always prints 0 again; changing
> optimization settings modifies the result of the program.

It doesn't matter whether or not 'x' is used. It matters whether 'f v' 
needs to be evaluated to get at the 'return' action. Even if 'f' is total,
that evaluation cannot be skipped. Eta-expansion changes strictness,
which changes the program (eg, '\p->(fst p,snd p)' =/= 'id::(a,b)->(a,b)',
even though these functions only apply to pairs, so we "know" that
"whatever 'p' is, it ought to be a pair" - only we don't; and neither do
we know that 'f v' is a number, even if 'f' itself is total).

None of this means that lazy IO and monadic IO ought to be mixed
freely. If a program depends on strictness/non-strictness, that needs 
to be taken into account, which can be troublesome, which is why
lazy IO hasn't been the default IO mechanism in Haskell for many
years. It is still available because when it is applicable, it can be
quite elegant and simple. But we need to decide whether or not
that is the case for each use case, even without the explicit 'unsafe'.

Hth?
Claus