[Haskell-cafe] What unsafeInterleaveIO is unsafe
wren ng thornton
wren at freegeek.org
Sun Mar 15 21:04:36 EDT 2009
Yusaku Hashimoto wrote:
> I was studying about what unsafeInterleaveIO is. I understood
> unsafeInterleaveIO takes an IO action, and delays it. But I couldn't
> find any reason why unsafeInterleaveIO is unsafe.
> I have already read an example in
> says lazy IO may break purity, but I think real matter in this example
> are wrong use of seq. did I misread?
For example: I have some universal state in IO. We'll call it an IORef,
but it could be anything, like reading lines from a file. And I have
some method for accessing and updating that state.
> next r = do n <- readIORef r
>             writeIORef r (n+1)
>             return n
Now, if I use unsafeInterleaveIO:
> main = do r <- newIORef 0
>           x <- do a <- unsafeInterleaveIO (next r)
>                   b <- unsafeInterleaveIO (next r)
>                   return (a,b)
>           ...
The values of a and b in x are entirely arbitrary, and are only set at
the point when they are first accessed. They're not just arbitrary
between which is 0 and which is 1, they could be *any* pair of values
(other than equal) since the reference r is still in scope and other
code in the ... could affect it before we access a and b, or between the
two accesses.
The arbitrariness is not "random" in the statistical sense, but rather
is an oracle for determining the order in which evaluation has occurred.
Consider, as an illustration, these two alternatives for the ...:
> fst x `seq` snd x `seq` return x
> snd x `seq` fst x `seq` return x
In this example, main will return (0,1) or (1,0) depending on which was
chosen. You are right that seq is what exposes the issue here, but seq
itself is a red herring. Having made x, we can pass it along to any
function, ignore the output of that function, and then inspect x to
learn the order of strictness in that function.
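
To make this concrete, here is a minimal, self-contained sketch of the example above. The names observe, fstFirst and sndFirst are hypothetical (they are not in the original code); observe takes the "..." as an argument so both alternatives can be run side by side. As described above, the first alternative is expected to yield (0,1) and the second (1,0), though seq does not formally guarantee evaluation order, so an optimizing compiler is free to behave differently.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Read the counter and bump it, as in the post.
next :: IORef Int -> IO Int
next r = do
  n <- readIORef r
  writeIORef r (n + 1)
  return n

-- Build the pair of deferred reads, then hand it to 'rest',
-- which plays the role of the "..." (hypothetical helper).
observe :: ((Int, Int) -> IO (Int, Int)) -> IO (Int, Int)
observe rest = do
  r <- newIORef 0
  x <- do
    a <- unsafeInterleaveIO (next r)
    b <- unsafeInterleaveIO (next r)
    return (a, b)
  rest x

-- The two alternatives for the "..." from the post.
fstFirst, sndFirst :: (Int, Int) -> IO (Int, Int)
fstFirst x = fst x `seq` snd x `seq` return x
sndFirst x = snd x `seq` fst x `seq` return x

main :: IO ()
main = do
  observe fstFirst >>= print
  observe sndFirst >>= print
```

Whichever order the compiler picks, each result is some arrangement of 0 and 1, and which arrangement you get reveals which component was forced first.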
Moreover, let's have two pure implementations, f and g, of the same
mathematical function. Even if f and g are close enough to correctly
give the same output for inputs with _|_ in them, we may be able to
observe the fact that they arrive at those answers differently by
passing in our x. Given that such observations are possible, it is no
longer safe to exchange f and g for one another, despite the fact that
they are pure and give the same output for all (meaningful) inputs.
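
A small sketch of that observation, under stated assumptions: sumL and sumR are hypothetical stand-ins for f and g (both compute the sum of a pair, forcing the components in opposite orders), and probe is a hypothetical helper that runs a pure function on x and then returns x for inspection. Whether the two inspected pairs actually differ depends on how the compiler orders evaluation.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

next :: IORef Int -> IO Int
next r = do
  n <- readIORef r
  writeIORef r (n + 1)
  return n

-- Two pure implementations of the same mathematical function
-- (hypothetical stand-ins for f and g).
sumL, sumR :: (Int, Int) -> Int
sumL (a, b) = a `seq` b `seq` (a + b)
sumR (a, b) = b `seq` a `seq` (a + b)

-- Run a pure function on the interleaved pair, then return both
-- its result and the pair itself so the pair can be inspected.
probe :: ((Int, Int) -> Int) -> IO (Int, (Int, Int))
probe f = do
  r <- newIORef 0
  a <- unsafeInterleaveIO (next r)
  b <- unsafeInterleaveIO (next r)
  let x = (a, b)
  y <- evaluate (f x)
  return (y, x)

main :: IO ()
main = do
  probe sumL >>= print  -- the sum is 1 either way
  probe sumR >>= print  -- but the inspected pair may differ
```

Both calls return the same sum, so sumL and sumR are indistinguishable as functions; only the pair left behind can tell them apart.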
This example is somewhat artificial because we set up x to use
unsafeInterleaveIO in the bad way. For the intended use cases where it
is indeed (arguably) safe, we would need to be sure to manually thread
the state through the pure value (e.g. x) such that the final value is
sane. For instance, in lazy I/O where we're constructing a list of
lines/bytes/whatever, we need to ensure that any access to the Nth
element of the list will first force the (N-1)th element, so that we
ensure that the list comes out in the same order as if we forced all of
them at construction time.
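
One way to get that threading, sketched here with the same counter rather than a real file: defer each cons cell as a whole, so that reaching the Nth cell requires forcing every cell before it. The name lazyNexts is hypothetical, and this is only an illustration of the idea, not how any particular lazy I/O library is implemented.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

next :: IORef Int -> IO Int
next r = do
  n <- readIORef r
  writeIORef r (n + 1)
  return n

-- Each cons cell is deferred as a unit: forcing a cell runs 'next'
-- once and leaves the tail as a further deferred action. The only
-- way to reach cell N is through cells 0..N-1, so the effects run
-- in list order no matter which element is demanded first.
lazyNexts :: IORef Int -> IO [Int]
lazyNexts r = unsafeInterleaveIO $ do
  n  <- next r
  ns <- lazyNexts r
  return (n : ns)

main :: IO ()
main = do
  r  <- newIORef 0
  xs <- lazyNexts r
  print (xs !! 3)    -- prints 3, even though it is demanded first
  print (take 5 xs)  -- prints [0,1,2,3,4]
```

Contrast this with the pair (a,b) above, where the two deferred actions are independent of each other and so can run in either order.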
For things like arbitrary symbol generation, unsafeInterleaveIO is
perfectly fine because the order and identity of the symbols generated
is irrelevant, but more importantly it is safe because the "IO" that's
going on is not actually I/O. For arbitrary symbol generation, we could
use unsafeInterleaveST instead, and that would be better because it
accurately describes the effects. For any IO value which has real I/O
effects, unsafeInterleaveIO is almost never correct because the ordering
of effects on the real world (or whether the effects occur at all)
depends entirely on the evaluation behavior of the program, which can
vary by compiler, by compiler version, or even between different runs of
the same compiled binary.
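
As an illustration of the symbol-generation case, here is a rough sketch of a splittable supply of fresh labels. The Supply type and the functions newSupply, label and split are hypothetical names chosen for this example. Which node receives which number depends on evaluation order, but every node receives a distinct one, which is all a fresh-name supply needs.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')
import Data.List (nub)
import System.IO.Unsafe (unsafeInterleaveIO)

-- An infinite binary tree of labels (hypothetical type).
data Supply = Supply Int Supply Supply

-- Each node draws its label from a shared counter when first forced.
newSupply :: IO Supply
newSupply = newIORef 0 >>= grow
  where
    grow :: IORef Int -> IO Supply
    grow r = unsafeInterleaveIO $ do
      n  <- atomicModifyIORef' r (\i -> (i + 1, i))
      l  <- grow r
      rt <- grow r
      return (Supply n l rt)

label :: Supply -> Int
label (Supply n _ _) = n

split :: Supply -> (Supply, Supply)
split (Supply _ l r) = (l, r)

main :: IO ()
main = do
  s <- newSupply
  let (l, r)   = split s
      (ll, lr) = split l
      labels   = map label [lr, r, s, ll, l]
  print labels                                  -- order-dependent numbers
  print (length (nub labels) == length labels)  -- prints True: all distinct
```

The only effect here is bumping a private counter, so nothing outside the program can observe when it happens.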