[Haskell] Lazy IO breaks purity

Thu Mar 5 03:14:20 EST 2009

I don't see any breaking of referential transparence in your code.
Every time you do an IO operation the result is basically
non-deterministic since you are talking to the outside world.
You're assuming the IO has some kind of semantics that Haskell makes
no promises about.

I'm not saying that this isn't a problem, because it is.
But it doesn't break referential transparency, it just makes the
semantics of IO even more complicated.

(I don't have a formal proof that unsafeInterleaveIO cannot break RT,
but I've not seen an example where it does yet.)

  -- Lennart

On Thu, Mar 5, 2009 at 2:12 AM,  <oleg at okmij.org> wrote:
>
>
> We demonstrate how lazy IO breaks referential transparency.  A pure
> function of the type Int->Int->Int gives different integers depending
> on the order of evaluation of its arguments. Our Haskell98 code uses
> nothing but the standard input.  We conclude that extolling the purity
> of Haskell and advertising lazy IO is inconsistent.
>
> Henning Thielemann wrote on Haskell-Cafe:
>> Say I have a chain of functions: read a file, ... write that to a file,
>> all of these functions are written in a lazy way, which is currently
>> considered good style
>
> Lazy IO should not be considered good style. One of the common
> definitions of purity is that pure expressions should evaluate to the
> same results regardless of evaluation order, or that equals can be
> substituted for equals. If an expression of the type Int evaluates to
> 1, we should be able to replace every occurrence of the expression with
> 1 without changing the results and other observables.
>
> The expression (read s) where s is the result of the (hGetContents h1)
> action has the pure type, e.g., Int. Yet its evaluation does more than
> just producing an integer: it may also close a file associated with
> the handle h1. Closing the file has an observable effect: the file
> descriptor, which is a scarce resource, is freed. Some OS lock the
> open file, or prevent operations such as renaming and deletion on the
> open file. A file system whose file is open cannot be
> unmounted. Suppose I use an Haskell application such as an editor to
> process data from a removable drive. I mount the drive, tell the
> application the file name. The (still running) application finished
> with the file and displayed the result. And yet I can't unplug the
> removable drive, because it turns out that the final result was
> produced without needing to read all the data from the file, and the
> file remains unclosed.
>
> Some people say: the programmer should have used seq. That is the
> admission of guilt! An expression (read s)::Int that evaluates to 1 is
> equal to 1. So, one should be able substitute (read s) with 1. If the
> result of evaluating 1 turns out not needed for the final outcome,
> then not evaluating (read s) should not affect the outcome. And yet it
> does. One uses seq to force evaluation of an expression even if the
> result may be unused. Thus the expression of a pure type
> has *side-effect*.
>
> The advice about using seq reminds me of advice given to C programmers
> that they should not free a malloc'ed pointer twice, dereference a
> zero pointer and write past the boundary of an array. If lazy IO is
> considered good style given the proper use of seq, then C should be
> considered safe given the proper use of pointers and arrays.
>
> Side affects like closing a file are observable in the external
> world. A program may observe these effects, in an IO monad. One can
> argue there are no guarantees at all about what happens in the IO
> monad. Can side-effects of lazy IO be observable in _pure_ Haskell
> code, in functions with pure type? Yes, they can. The examples are
> depressingly easy to write, once one realizes that reading does have
> side effects, POSIX provides for file descriptor aliasing, and sharing
> becomes observable with side effects. Here is a simple Haskell98 code.
>
>
>> {- Haskell98! -}
>>
>> module Main where
>>
>> import System.IO
>> import System.Posix.IO (fdToHandle, stdInput)
>>
>> -- f1 and f2 are both pure functions, with the pure type.
>> -- Both compute the result of the subtraction e1 - e2.
>> -- The only difference between them is the sequence of
>> -- evaluating their arguments, e1 `seq` e2 vs. e2 `seq` e1
>> -- For really pure functions, that difference should not be observable
>>
>> f1, f2:: Int -> Int -> Int
>>
>> f1 e1 e2 = e1 `seq` e2 `seq` e1 - e2
>> f2 e1 e2 = e2 `seq` e1 `seq` e1 - e2
>>
>> read_int s = read . head . words $ s
>>
>> main = do
>>        let h1 = stdin
>>        h2 <- fdToHandle stdInput
>>        s1 <- hGetContents h1
>>        s2 <- hGetContents h2
>>        print $ f1 (read_int s1) (read_int s2)
>>        -- print $ f2 (read_int s1) (read_int s2)
>
>
> One can compile it with GHC (I used 6.8.2, with and without -O2) and
> run like this
>   ~> /tmp/a.out
>   1
>   2
>   -1
> That is, we run the program and type 1 and 2 as the inputs. The
> program prints out the result, -1. If we comment out the line
> "print $ f1 (read_int s1) (read_int s2)" and uncomment out the last
> line the transcript looks like this
>   ~> /tmp/a.out
>   1
>   2
>   1
>
> Running the code with Hugs exhibits the same behavior. Thus a pure
> function (-) of the pure type gives different results depending on the
> order of evaluating its arguments.
>
>        Is 1 = -1?
>
>
>
>
>
>
> _______________________________________________
> Haskell mailing list
> Haskell at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell
>