The IO sin bin [was: Re: [Haskell-cafe] Re: [Haskell] Top Level <-]

Fri Sep 5 00:30:47 EDT 2008

Adrian Hey wrote:
> There's shed loads of information and semantic subtleties about pretty
> much any operation you care to think of in the IO monad that isn't
> communicated by its type. All you know for sure is that it's weird,
> because if it wasn't it wouldn't be in the IO monad.
> 
> So I think you're applying double standards.

Not to throw any more fuel on the fire (if at all possible), but the 
reason behind this is that IO has become a sin bin for all the things 
that people either don't know how to deal with, or don't care enough to 
tease apart. There are many people who would like to break IO apart into 
separate segments for all the different fragments of the RealWorld that 
actually matter for a given purpose. To date it has not been entirely 
clear how best to do this and retain a workable language.

The fact that this discussion is going on at all is, IMO, precisely 
because of the sin-bin nature of IO. People have things they want to 
have "global" or otherwise arbitrarily large scope, but the only notion 
of a globe in Haskell is the RealWorld. Hence they throw things into IO 
and then unsafePerformIO it to get it back out. There are three problems 
with this approach:

(1) It's a hack and not guaranteed to work, nuff said.

(2) The RealWorld is insufficiently described to ensure any semantics 
regarding *how* it holds onto the state requested of it. This problem 
manifests itself in the discussion of loading the same library multiple 
times, having multiple RTSes in a single OS process, etc. In those 
scenarios what exactly the "RealWorld" is and how the baton is passed 
among the different libraries/threads/processes/RTSes is not clearly 
specified.

(3) The API language is insufficiently detailed to make demands on what 
the RealWorld holds. This problem manifests itself in the argument about 
whether libraries should be allowed to implicitly modify portions of the 
RealWorld, or whether this requirement should be made clear in the type 
signatures of the library.
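
For concreteness, here is a minimal sketch of the usual shape of the hack
described above. The names `counter` and `nextId` are made up for
illustration; the NOINLINE pragma is the conventional GHC workaround, not
something the language report guarantees.

```haskell
module Main (main) where

import Data.IORef (IORef, newIORef, atomicModifyIORef')
import System.IO.Unsafe (unsafePerformIO)

-- A "global" mutable cell (hypothetical example). NOINLINE matters:
-- if GHC inlined this definition, each use site could end up
-- allocating its own IORef instead of sharing one.
{-# NOINLINE counter #-}
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)

-- Return the current value and bump the counter by one.
-- Note that the type says nothing about which state is touched.
nextId :: IO Int
nextId = atomicModifyIORef' counter (\n -> (n + 1, n))

main :: IO ()
main = do
  a <- nextId
  b <- nextId
  print (a, b)
```

Nothing in the type of `nextId` says that it depends on `counter`, which
is problem (3), and nothing says what happens to `counter` if the module
is loaded twice in one process, which is problem (2).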

As I said in the thread on [Research language vs. professional 
language], I think coming up with a solution to this issue is still an 
open research problem, and one I think Haskell should be exploring.

The ACIO monad has a number of nice properties and I think it should be 
broken out from IO even if top-level <- aren't added to the language. 
The ability to declare certain FFI calls as ACIO rather than IO is, I 
think, reason enough to pursue ACIO on its own. But ACIO does not solve 
the dilemmas raised in (2) and (3). Top-level mutable state is only a 
symptom of the real problem that IO and the RealWorld are insufficiently 
described.
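
As a rough sketch of what breaking ACIO out of IO could look like at the
library level, assuming ACIO is read in the usual way as "affine central
IO" (actions that can be dropped if unused and that commute with all
others). Every name here (`ACIO`, `newIORefAC`, `runACIO`) is hypothetical,
and the compiler does not check the affine/central promise; a real design
would at least hide the constructor behind a module boundary.

```haskell
module Main (main) where

import Data.IORef (IORef, newIORef, readIORef, modifyIORef')

-- Hypothetical wrapper for IO actions that the author promises are
-- affine and central. The promise is by convention only.
newtype ACIO a = ACIO (IO a)

instance Functor ACIO where
  fmap f (ACIO m) = ACIO (fmap f m)

instance Applicative ACIO where
  pure = ACIO . pure
  ACIO f <*> ACIO x = ACIO (f <*> x)

instance Monad ACIO where
  ACIO m >>= k = ACIO (m >>= \a -> let ACIO n = k a in n)

-- Allocating a fresh reference: nobody can observe whether or when
-- it happened unless they hold the resulting reference.
newIORefAC :: a -> ACIO (IORef a)
newIORefAC = ACIO . newIORef

-- Every ACIO action can be used as an ordinary IO action.
runACIO :: ACIO a -> IO a
runACIO (ACIO m) = m

main :: IO ()
main = do
  ref <- runACIO (newIORefAC (0 :: Int))
  modifyIORef' ref (+ 1)
  readIORef ref >>= print
```

An FFI import that only allocates or queries could be given an ACIO type
in the same way, while reads and writes of the reference stay in IO.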

Another example where unsafePerformIO is used often is when doing RTS 
introspection. Frequently, interrogating the RTS has ACIO-like 
properties in that we are only interested in the RTS if a particular 
thunk happens to get pulled on, and we're only interested at the time 
that the pulling occurs rather than in sequence with any other actions. 
The use of unsafePerformIO here seems fraught with all the same problems 
as top-level mutable state. It would be nice to break out an RTS monad 
(or an UnsafeGhcRTS monad, or what have you) in order to be more clear 
about the exact requirements of what's going on.

But even if we break ACIO and UnsafeGhcRTS out from IO, the dilemmas 
remain. To a certain extent, the dilemmas will always remain because 
there will always be a frontier beyond which we don't know what's 
happening: the real world exists, after all. However, there is still room 
to hope for a general approach to the problem.

One possibility is to follow on the coattails of _Data Types a la Carte_. 
Consider, for example, if the language provided a mechanism for users to 
generate RealWorld-like non-existent tokens. Now consider removing IO[1] 
and only using ST, where the thread-state parameter could be any 
(co)product of RealWorld-like tokens. We could then have an overloaded 
function to lift any (ST a) into an ST (a :+: b). There are some 
complications with DTalC's coproducts in practice. For example, (a :+: 
b) and (b :+: a) aren't considered the same type, as naturally they 
can't be due to the Inl/Inr tagging. A similar approach should be able 
to work, however, since these tokens don't really exist at all.

Of course, once we take things to that level we're already skirting 
around capability systems. Rather than using an ad-hoc approach like 
this, I think it would be better to work out a theory connecting 
capability systems to monad combinators, and then use that theory to 
smash the sin bin of IO.

[1] Or leaving it in as a type alias for ST RealWorld.
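
A rough sketch of the token idea, assuming GHC's ST plays the role of the
state-thread monad. The tokens `Counter` and `Log`, the phantom `:+:`, and
the functions `liftL`, `liftR`, and `runWorld` are all hypothetical. For
simplicity the lifting is two explicit functions rather than one
overloaded one, and it leans on the unsafe conversions from
Control.Monad.ST.Unsafe, so this shows the typing only and is not a safe
library.

```haskell
{-# LANGUAGE TypeOperators #-}
module Main (main) where

import Control.Monad.ST (ST)
import Control.Monad.ST.Unsafe (unsafeIOToST, unsafeSTToIO)
import Data.STRef (STRef, newSTRef, readSTRef, modifySTRef')

-- Hypothetical tokens: uninhabited types naming fragments of the world.
data Counter
data Log

-- Phantom "sum" of tokens. It has no constructors, so there is no
-- Inl/Inr tag to get in the way.
data a :+: b
infixr 6 :+:

-- Lift an action over one token into an action over a larger world.
liftL :: ST a x -> ST (a :+: b) x
liftL = unsafeIOToST . unsafeSTToIO

liftR :: ST b x -> ST (a :+: b) x
liftR = unsafeIOToST . unsafeSTToIO

-- Run an action over any world; stands in for "ST RealWorld is IO".
runWorld :: ST s x -> IO x
runWorld = unsafeSTToIO

-- Touches only the Counter fragment, and says so in its type.
tick :: STRef Counter Int -> ST Counter Int
tick ref = modifySTRef' ref (+ 1) >> readSTRef ref

-- Touches only the Log fragment.
note :: STRef Log [String] -> String -> ST Log ()
note ref msg = modifySTRef' ref (msg :)

-- Needs both fragments, and the type records that.
program :: STRef Counter Int -> STRef Log [String] -> ST (Counter :+: Log) Int
program c l = do
  n <- liftL (tick c)
  liftR (note l ("tick " ++ show n))
  pure n

main :: IO ()
main = do
  c <- runWorld (newSTRef 0 :: ST Counter (STRef Counter Int))
  l <- runWorld (newSTRef [] :: ST Log (STRef Log [String]))
  n <- runWorld (program c l)
  msgs <- runWorld (readSTRef l)
  print (n, msgs)
```

With explicit `liftL`/`liftR`, `ST (Counter :+: Log)` and `ST (Log :+:
Counter)` are still distinct types; an overloaded lift would need a
membership class along the lines of DTalC's, which is where the
complications mentioned above come back in.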

-- 
Live well,
~wren