[Haskell-cafe] Re: [Haskell] Re: Global Variables and IO initializers
Benjamin Franksen
benjamin.franksen at bessy.de
Sun Nov 7 20:55:50 EST 2004
[moving to haskell-cafe]
Sorry for the long post.
On Sunday 07 November 2004 22:55, Adrian Hey wrote:
> On Sunday 07 Nov 2004 1:45 pm, Benjamin Franksen wrote:
> > It's a similar advantage as using the IO monad has over allowing
> > arbitrary side-effects in functions: The IO monad gives you a clear
> > separation between stuff that has (side-) effects (i.e. depends on the
> > real world) and pure functions (which don't). Abandoning global variables
> > gives you a clear separation of stuff that depends on initialized state
> > and other stuff that does not depend on it.
>
> I don't agree. Hidden dependencies are a fact of life with stateful
> programming in general and IO monad in particular. Making some
> references explicit arguments (as you seem to be suggesting) does
> not eliminate the problem, it merely complicates an api for no good
> reason.
You have a point here: hidden dependencies are something that is inherently
possible in the IO monad. You can for instance easily create global variables
using the FFI without resorting to unsafePerformIO. I'll take back what I
said above. But I maintain that it is a good idea to avoid hiding
dependencies if possible.
> Hiding internal state dependencies is a *good thing*. The trick is
> organise the dependencies and provide a robust "idiot proof" api so
> that users don't have to know about the internal organisation and any
> dependencies.
Oh, but the user *has* to know about them. The user must call the init routine
before using other routines of the library, remember? Why are you against the
type checker reminding her?
I know a lot of those "idiot proof" libraries: "You need to call X then call Y
but not if Z was called before..." One of the ideas behind using functions
with arguments and a static type system is to encode dependencies so that the
compiler can enforce them.
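
To make this concrete, here is a minimal sketch of letting the type checker
do the reminding. All names (Lib, initLib, logLine, dumpLog, demoHandle) are
made up for illustration, and the "library" merely records strings. It
assumes the module would not export the Lib constructor, so that initLib is
the only way to obtain a handle.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)

-- Hypothetical library handle. In a real module the constructor would not
-- be exported, so the only way to obtain a Lib is to run initLib.
newtype Lib = Lib (IORef [String])

-- The init routine: allocates the library's state and returns a handle.
initLib :: IO Lib
initLib = Lib <$> newIORef []

-- Every other routine takes the handle, so "call initLib first" is
-- checked by the compiler instead of being stated in the docs.
logLine :: Lib -> String -> IO ()
logLine (Lib ref) s = modifyIORef' ref (s :)

-- Read back everything logged so far, oldest entry first.
dumpLog :: Lib -> IO [String]
dumpLog (Lib ref) = reverse <$> readIORef ref

-- Small usage example: there is no way to write the logLine calls
-- without first having obtained 'lib' from initLib.
demoHandle :: IO [String]
demoHandle = do
  lib <- initLib
  logLine lib "first"
  logLine lib "second"
  dumpLog lib
```

Forgetting the initialization here is a compile-time error (there is no Lib
value to pass), not a core dump at run time.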
And BTW what if your idiot proof initialization routine needs arguments to
configure the library? Is the user still allowed to call it from several
places in his code, now with possibly different arguments? And with what
effect?
> I don't believe this is a new (or controversial) idea.
> It's the basic idea behind stateful modular or OO programming.
> All the user sees is a set of actions which collectively deliver on
> a promise (by unknown means).
OO is the best argument *against* global variables. Pure OO languages have
*no* hidden global state. In every real OO program you have the dependency
explicit, since you always need a "target" object on which to invoke methods.
It doesn't matter that you write "object.f" instead of "f object" as you
would in Haskell. I have never heard anyone using an OO language complain
about that.
The two best OO languages I know of are Eiffel and O'Haskell/Timber. Neither
has global variables. Eiffel has 'once' routines, which seem to be similar
to what you are after. Timber doesn't even have top-level IO actions; instead
everything you need from the environment is given as an argument to main.
Note that Timber is used for real-time control, an inherently stateful and
IO-intensive field.
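
The "environment as an argument to main" style can be imitated in plain
Haskell. The following is only a rough analogy, not Timber syntax, and the
names (Env, envArgs, envPutLine, realMain, runCaptured) are hypothetical.

```haskell
import Data.IORef (newIORef, modifyIORef', readIORef)

-- Hypothetical record of everything the program may use from the outside
-- world. This is ordinary Haskell, not Timber.
data Env = Env
  { envArgs    :: [String]          -- command line arguments
  , envPutLine :: String -> IO ()   -- how to emit one line of output
  }

-- The "real" main receives its environment explicitly.
realMain :: Env -> IO ()
realMain env = mapM_ (envPutLine env . ("arg: " ++)) (envArgs env)

-- Because the environment is an ordinary argument, it can be replaced by
-- a fake one that collects output in an IORef instead of printing it.
runCaptured :: [String] -> IO [String]
runCaptured args = do
  out <- newIORef []
  realMain (Env args (\s -> modifyIORef' out (s :)))
  reverse <$> readIORef out
```

A production entry point would build the Env once, for instance from
System.Environment.getArgs and putStrLn, and hand it to realMain.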
Your opinion that it automatically leads to a horrible API if you have to pass
the initialized state around amounts to saying that in an OO language like
Eiffel only libraries with horribly inconvenient APIs can be written. This is
ridiculous.
Even in C++ using global variables is nowadays generally regarded as bad
design, especially for libraries.
> > [...] You know that IO actions have (side-) effects, so you
> > would take care that the actions get executed as many times as is
> > appropriate. If the library docs indicate that it makes no sense to call
> > it twice, why would you do so?
>
> Given such a statement about realInit you wouldn't (or to be more precise,
> given a statement that calling it twice or more will really screw things
> up).
I would be really interested to know what kind of init action you are talking
about, that so badly screws everything up if called twice. This is not
rhetoric, I mean it.
> But the question is *how* is the user to ensure that it is only called
> once. I see no other way than the darned awkward alternative I gave.
We have an interesting stalemate here: You argue that you want a feature
so that you can enforce that a routine is called *at most* once. I argue that
if you do this by hiding state dependencies, you are losing the ability to
enforce that it is called *at least* once.
You argue that it might be catastrophic if the library is initialized more than
once. I argue that it is usually catastrophic (with this I mean core dump or
at least exception if it is programmed defensively) if you don't initialize
it at all.
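
For comparison, here is a sketch of the hidden-state style, to show what
the type checker can no longer see. It uses the usual unsafePerformIO idiom
for a top-level IORef; hiddenState, hiddenInit, nextId and demoHidden are
hypothetical names, and this toy is programmed defensively, so it returns
an error value where a C library would more likely crash.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hidden global state, created with the usual unsafePerformIO idiom.
-- Nothing in the types of the routines below mentions it.
{-# NOINLINE hiddenState #-}
hiddenState :: IORef (Maybe Int)
hiddenState = unsafePerformIO (newIORef Nothing)

-- Hypothetical init routine; it has no handle to return.
hiddenInit :: IO ()
hiddenInit = writeIORef hiddenState (Just 0)

-- Type-checks whether or not hiddenInit has run; a missing
-- initialization only shows up at run time.
nextId :: IO (Either String Int)
nextId = do
  st <- readIORef hiddenState
  case st of
    Nothing -> return (Left "library not initialized")
    Just n  -> do
      writeIORef hiddenState (Just (n + 1))
      return (Right n)

-- The first call happens before initialization and still compiles.
demoHidden :: IO (Either String Int, Either String Int)
demoHidden = do
  before <- nextId
  hiddenInit
  after <- nextId
  return (before, after)
```

Note also that a second call to hiddenInit silently resets the counter,
which is the "at most once" problem in miniature.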
> I
> suppose the other alternative is the noddy realInit is only used once in an
> action which is only used once, in an action .. from main (which is only
> used once hopefully). Is this what you have in mind?
It's the same stalemate as above: If you do it your way, you have the problem with
ensuring that it gets called at least once before you call routines that
depend on it. And that gets *really* hard as soon as you have concurrent
threads.
Maybe we should look for a solution that can enforce *both* invariants, "at
least once" as well as "at most once"? It's only that I can't see such a
solution and therefore my preferences would be to redesign 'realInit' in such
a way that calling it twice is not fatal but just creates another
'instance' (can't be more specific without knowing what the library does).
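
As a sketch of that kind of redesign, under the assumption that the
library's state can be allocated per call at all: the Config and Instance
types, this version of realInit, nextValue and demoInstances are all
hypothetical, since I don't know what the real library does.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')

-- Hypothetical configuration and instance types.
newtype Config = Config { startAt :: Int }

newtype Instance = Instance (IORef Int)

-- Each call builds a fresh, independent instance, so a second call is
-- harmless: it cannot disturb state owned by the first.
realInit :: Config -> IO Instance
realInit cfg = Instance <$> newIORef (startAt cfg)

-- Hand out the next counter value of one particular instance.
-- atomicModifyIORef' keeps this safe under concurrent use.
nextValue :: Instance -> IO Int
nextValue (Instance ref) = atomicModifyIORef' ref (\n -> (n + 1, n))

-- Two initializations with different arguments, used side by side.
demoInstances :: IO [Int]
demoInstances = do
  a <- realInit (Config 0)
  b <- realInit (Config 100)   -- second init, different arguments
  x <- nextValue a
  y <- nextValue b
  z <- nextValue a
  return [x, y, z]
```

Here the question "what happens if init is called twice with different
arguments" has an obvious answer: you get two instances that do not
interfere with each other.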
> The behaviour of (and consequent constraints on correct usage of)
> realInit and putString are very different. Must I elaborate them?
True, I can't see any constraints on correct usage of 'putString' that aren't
enforced by the type checker. And that is exactly how it should be.
Maybe the problem with your 'realInit' is that it needs such constraints?
Again, giving an example might convince me that these constraints are
inherent to the problem domain and can't be worked around.
> > > It doesn't seem very attractive to users either
> > > (considerably complicates their code and places the burden on them to
> > > "get it right").
> >
> > It may seem so at first, but I think it's a delusion.
>
> Trust me on this, for whatever reason, it's absolutely vital that realInit
> is used 0 or 1 times only, 2 or more is a catastrophic error.
I would very much like to trust you, but why can't you give us an example? Are
you talking about mission-critical stuff like controlling an airplane? But
you don't initialize a library in full flight, do you?
So why is it catastrophic and what exactly does that mean? I thought you meant
a core dump, but I am no longer sure...
Maybe the reason is that it calls out to C libraries with a broken API? (I
know of enough such libraries, and interfacing them in a clean manner is
sometimes a pain in the ass.)
> So I'll ask again. Please provide a simpler _and_ safer alternative
> (some real Haskell code please).
And I'll ask again for an example to convince me of the necessity.
> > At the moment I cannot imagine a well designed library interface where
> > user code would be considerably complicated if no global variables were
> > used. But maybe you have a good example at hand to prove that this is
> > merely due to lack of imagination on my side, and that I was extremely
> > lucky with the HWS? ;-)
>
> Indeed, I believe this is the case. I'm guessing of course, but I imagine
> all your IO is done via standard Haskell library calls (socket API
> or whatever), in which case they will hide a lot of the stateful complexity
> of their implementation already.
I don't know about the latter. I do know that there are no constraints on
usage in the form of "this must be called before that", besides the ones
automatically enforced by the type system. An exception might be the posix
libraries, but they are only a thin layer over a badly designed C API. I
could be wrong, but I doubt that there is lots of hidden state in the Haskell
part.
I once wrote a Haskell binding to a C library for a special network protocol.
I never even considered using unsafePerformIO except for C routines that were
actually pure functions. What I *did* need to consider and work around was
that the C API was in some places hiding global state, which was *very* bad.
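
To illustrate the one legitimate use mentioned above, here is a sketch
using the standard C function frexp as a stand-in (it is not from the
library I wrapped). frexp is a pure function mathematically, but it returns
its second result through a pointer, so the binding has to go through IO.
The wrapper name frexpPure is made up.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble (..), CInt (..))
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek)
import System.IO.Unsafe (unsafePerformIO)

-- double frexp(double x, int *exp): splits x into a mantissa and a power
-- of two. The out-parameter forces an IO type on the raw import.
foreign import ccall unsafe "math.h frexp"
  c_frexp :: CDouble -> Ptr CInt -> IO CDouble

-- The result depends only on the argument and no state is touched, so
-- wrapping the call in unsafePerformIO does not hide any dependency.
frexpPure :: Double -> (Double, Int)
frexpPure x = unsafePerformIO $
  alloca $ \expPtr -> do
    m <- c_frexp (realToFrac x) expPtr
    e <- peek expPtr
    return (realToFrac m, fromIntegral e)
```

For example, frexpPure 8.0 evaluates to (0.5, 4), since 8 = 0.5 * 2^4.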
Another example: Have you ever used ONC/RPC (Remote Procedure Call)? I
saw implementations that came with a real-time multithreaded OS where the
docs said, more or less: "All created objects such as client handle may only
be used from the thread that created them." *That* is a horrible API, because
it means you can not pass these objects around freely but have to make sure
your routine isn't called from the "wrong" thread! And the reason for this
restriction was (of course) that the library was hiding state inside
thread-local variables.
> If so it seems to me you're using the fact
> that somebody has already solved the problem for you as an argument that
> no solution is necessary.
Maybe. We can talk in "if" sentences until we both die of old age.
> (It would be interesting to see what the api's
> of the libraries you're using would look like, if they had been designed
> according to the principles you're advocating).
Yes, that would be interesting. And it is not a matter of me holding up holy
principles against an evil reality. I am talking about practical
considerations, not ideals. I hope I've made that clear with the above
examples.
Cheers,
Ben