[Haskell-cafe] Just how unsafe is unsafe

Fri Feb 6 04:53:52 EST 2009

Hi,

> My opinion is that unsafeXXX is acceptable only when its use is
> preserved behind an abstraction that is referentially transparent and
> type safe. Others may be able to help refine this statement.

I would agree with this. The problem is that impurity spreads easily.
For example, suppose we have this truly random number generator,
'random'. As soon as we have this, then *almost every* function is
potentially impure:

f :: Integer -> Integer
f x = random + x

g :: [Integer]
g = repeat random

etc. etc. The compiler has no way of tracking impurity other than
through the type system, which is, of course, exactly what monads do.
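
As a minimal sketch of what that tracking looks like, here are the same
two functions with the impurity made visible in their types. The
'random' below is only a hypothetical stand-in (it reads the CPU clock,
which is impure but not actually random), and 'g' takes a length
because an infinite list of IO results could never be returned:

```haskell
import System.CPUTime (getCPUTime)

-- Hypothetical stand-in for 'random': any impure source of Integers
-- would do. The IO in the type is what records the impurity.
random :: IO Integer
random = getCPUTime

-- The impurity of 'random' now shows up in the type of 'f' ...
f :: Integer -> IO Integer
f x = do
  r <- random
  pure (r + x)

-- ... and in the type of 'g'. The count 'n' is an addition of this
-- sketch, since 'repeat' has no terminating counterpart in IO.
g :: Int -> IO [Integer]
g n = mapM (const random) [1 .. n]
```

Any caller of 'f' or 'g' has to live in IO as well, so the type checker
can see exactly how far the impurity has spread.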

To echo the sentiment above, the only safe way to use unsafePerformIO is
to hide it behind a function that *is* guaranteed to be pure (i.e.,
returns the same values for the same arguments, can be inlined, etc.).
And even then, I would recommend against it.  Let me give a *practical*
reason why. 

For a long time, I would never even have considered unsafePerformIO, but
I recently had an application that needed unique global identifiers in
lots of places, and I was reluctant to pass state around everywhere;
*but* I could hide it in a few functions, which were themselves pure. It
looked something like:

-- Replace (some) name by a number
quote :: Integer -> Term -> Term

-- Replace a number by (that) name
unquote :: Name -> Term -> Term

-- Typical usage ('doSomeStuff' is a placeholder for "do some stuff")
foo t = let l = getUnsafeUniqueGlobalIdentifier () in
        (unquote l . doSomeStuff . quote l) t

Since "unquote l . quote l" is an identity operation, 'foo' itself is
pure -- provided that nothing in 'do some stuff' relies on the exact
identity of the identifier. 

--- A rule which I broke at some point; I got some very strange
behaviour, which took me ages to debug. This was mostly due to laziness,
which made the point of execution of the unsafe operation very
difficult to predict.

For example, every call to getUnsafeUniqueGlobalIdentifier (it wasn't
actually called that, don't worry :-) yielded a number one higher than
the previous one. However, in a list of terms [t1, t2, .., tn], all of
which include some unique identifier, it is *not* the generation of the
list that determines whether the identifiers in these terms are
incrementing, but the *evaluation* of the list -- that is, when the
terms are forced to normal form. I was calling 'sort' on this list, and
sort depended on the values of these identifiers -- but since sort
evaluated the terms in the list to normal form in a hard-to-predict
order, the order of the list was anything but sorted!

--- Moreover, you need all sorts of compiler options or nasty hacks (the
unit argument to getUnsafeUniqueGlobalIdentifier above is no mistake) to
avoid the compiler optimizing your code in ways that you did not expect.
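
The two points above can be sketched roughly as follows. This is not the
code from the application described here: 'counter', 'freshId', 'ids'
and 'forcedBackwards' are hypothetical names, and the global-IORef idiom
is simply the usual way such a counter gets written. It is an
illustration of the hazard, not a recommendation.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')
import System.IO.Unsafe (unsafePerformIO)

-- Global counter. NOINLINE keeps GHC from inlining the definition,
-- which could duplicate the newIORef call and silently create several
-- independent counters.
counter :: IORef Integer
counter = unsafePerformIO (newIORef 0)
{-# NOINLINE counter #-}

-- Hypothetical stand-in for getUnsafeUniqueGlobalIdentifier. The unit
-- argument is meant to make every use a fresh call instead of one
-- shared constant; the optimiser may still float or common up calls,
-- which is why -fno-cse and -fno-full-laziness are the usual
-- defensive flags for code like this.
freshId :: () -> Integer
freshId () = unsafePerformIO (atomicModifyIORef' counter (\n -> (n + 1, n)))
{-# NOINLINE freshId #-}

-- Three "unique" identifiers, none of which exists until it is forced.
ids :: [Integer]
ids = [freshId (), freshId (), freshId ()]

-- Force the elements from last to first, then return the list. The
-- numbers are handed out in evaluation order, not list order, so the
-- result need not be ascending.
forcedBackwards :: [Integer]
forcedBackwards = foldr seq ids (reverse ids)
```

Loading this into GHCi and evaluating 'forcedBackwards' shows what the
list contains after being forced back to front; what you get with
optimisation switched on depends on what the compiler decided to share.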

In the end, I rewrote the entire application to avoid the use of these
global unique identifiers, because it was simply too difficult to get
right. I felt I was writing C code again, chasing bugs due to dangling
pointers and the wrong memory being used. Not a time I want to return
to!
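
The post does not say what the rewrite looked like, so the following is
only one common way of getting unique identifiers without
unsafePerformIO: thread the next free number through a small
hand-rolled state monad. All names here ('Supply', 'fresh', 'label',
'evalSupply') are hypothetical.

```haskell
-- A tiny state monad that threads the next free identifier.
newtype Supply a = Supply { runSupply :: Integer -> (a, Integer) }

instance Functor Supply where
  fmap f (Supply g) = Supply $ \n ->
    let (a, n') = g n in (f a, n')

instance Applicative Supply where
  pure a = Supply $ \n -> (a, n)
  Supply f <*> Supply g = Supply $ \n ->
    let (h, n')  = f n
        (a, n'') = g n'
    in (h a, n'')

instance Monad Supply where
  Supply g >>= k = Supply $ \n ->
    let (a, n') = g n in runSupply (k a) n'

-- Hand out the current identifier and advance the counter.
fresh :: Supply Integer
fresh = Supply $ \n -> (n, n + 1)

-- Label every element of a list. The identifiers follow list order by
-- construction, whatever order the results are later evaluated in.
label :: [a] -> Supply [(Integer, a)]
label = mapM (\x -> do { i <- fresh; pure (i, x) })

-- Run a computation starting from identifier 0.
evalSupply :: Supply a -> a
evalSupply m = fst (runSupply m 0)
```

Here 'evalSupply (label "abc")' is [(0,'a'),(1,'b'),(2,'c')] no matter
when or how the list is forced, because the numbering is part of the
data flow rather than a side effect.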

Moral of the story: unless you really really need to and really really
know what you are doing -- do not use unsafePerformIO. Uncontrolled side
effects and laziness will cause extremely hard-to-track behaviour in
your program, and things are almost guaranteed to go wrong.

Edsko