Optimizations for mutable structures?
Jan-Willem Maessen
jmaessen at alum.mit.edu
Wed Dec 7 23:09:17 EST 2005
Oh, dear, a brief mail with a high-level view of optimization seems
to have touched off some confusion about concurrency. I wasn't
thinking about concurrency at all when I wrote my original message,
but there seem to be some major misconceptions in the ensuing
discussion, which I'd like to clear up.
On Dec 7, 2005, at 9:15 AM, Simon Marlow wrote:
> On 07 December 2005 13:38, Malcolm Wallace wrote:
>
>> Jan-Willem Maessen <jmaessen at alum.mit.edu> writes:
>>
>>> - Fetch elimination for imperative reads:
>>>     writeIORef r e >> acts >> readIORef r
>>> === writeIORef r e >> acts >> return e
>>
>> This transformation is valid only on single-threaded systems.
>> If there is any possibility of an IORef being shared across threads,
>> you are out of luck.
>
> (assuming 'acts' doesn't modify 'r').
This was my intended meaning; I dashed off the lines above and
trusted they would be understood. Apparently I should have been clearer.
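To make the side condition concrete, here is a minimal sketch of the
two cases (the helper names are mine, purely for illustration):

  import Data.IORef

  -- Eliminable: everything between the write and the read is known
  -- not to touch r, so the final read may be replaced by the value
  -- just written.
  before, after :: IORef Int -> IO Int
  before r = do
    writeIORef r 42
    putStrLn "unrelated work"   -- plays the role of 'acts'
    readIORef r                 -- the fetch to eliminate
  after r = do
    writeIORef r 42
    putStrLn "unrelated work"
    return 42                   -- read replaced by the known value

  -- Not eliminable: 'unknown' might write r (or switch threads), so
  -- the read must stay.
  blocked :: IORef Int -> IO () -> IO Int
  blocked r unknown = do
    writeIORef r 42
    unknown
    readIORef r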
> No, Jan's transformation is correct even in a multithreaded setting.
> It might eliminate some possible outcomes from a non-deterministic
> program, but that's ok.
I agree with Simon here. "Eliminate some possible outcomes" can
indeed be defined in a meaningful and technically rigorous way.
Others (I'll single out Malcolm and Claus here) have railed against
this view, arguing that every program behavior should be preserved.
Claus even raises the (technically difficult) issue of fairness.
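To see what "eliminating an outcome" looks like in practice, here is
a small racy sketch (assuming GHC's Control.Concurrent):

  import Control.Concurrent
  import Data.IORef

  main :: IO ()
  main = do
    r    <- newIORef (0 :: Int)
    done <- newEmptyMVar
    _    <- forkIO (writeIORef r 1 >> putMVar done ())
    writeIORef r 2
    v <- readIORef r   -- the semantics allow 1 or 2 here; fetch
                       -- elimination turns this into 'return 2',
                       -- pruning the interleaving where the child
                       -- thread's write lands in between
    takeMVar done
    print v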
Sadly, I'm afraid any realistic implementation *will* eliminate
possible outcomes. In its cooperative multithreading, GHC constantly
makes decisions about where thread switches may occur, and I suspect
that inserting a possible switch at every such point would cause
dramatic performance drops.
But worse than that: even if we inserted a check between every pair
of IORef operations, the processor may well decide to execute
successive IORef operations in parallel, or even choose to reorder
two IORef operations that refer to different locations. It may even
choose to eliminate the read in the above example before the write
has become visible to any other thread! In effect we get behavior
indistinguishable from the suggested optimizations.
Now, such behavior isn't always acceptable, so there are ways to get
back to sanity. However, those ways (memory barriers and/or atomic
memory operations) are extremely expensive. I'm betting one or both
of you regularly use an x86 machine, for which there is not even a
rigorous specification of where these operations must be inserted!
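Where such pruning is unacceptable, the atomic operations are the
tool; a minimal sketch using Data.IORef's atomic read-modify-write
(its cost is exactly the synchronization discussed above):

  import Data.IORef

  -- atomicModifyIORef performs the read-modify-write as a single
  -- atomic step; plain readIORef/writeIORef pairs make no such
  -- guarantee, and the processor is free to reorder them.
  incr :: IORef Int -> IO Int
  incr r = atomicModifyIORef r (\n -> (n + 1, n + 1))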
Nonetheless, we get by. We do so by defining idioms based on
synchronization: MVars and TMVars are entirely appropriate places to
be enforcing memory ordering. Much of the early pain of the Java
Memory Model revision (Claus referred to the mess made of the
original spec, since fixed) went into avoiding the need to insert
barriers in most code. A consensus was reached on an acceptable
programming style: use monitor synchronization and avoid data races,
or, failing that, use volatile variables in particular well-defined
ways. If you break those rules, all bets are off; there is a lengthy
spec defining exactly what that means (mostly to rule out behaviors
like "the program creates a valid password out of thin air"), but
that style is everything you, the programmer, need to understand.
Similar consensus opinions have formed in other communities, usually
without rigorous specifications to back them up. Voices in this
thread have suggested that the right idiom for Haskell is to
synchronize using TMVars and MVars. I agree (actually I hope that
TMVars are enough, though I'd love to have a TMArray), and I think we
can even guarantee reasonable things about IORefs that get passed
from thread to thread in this way. Want fairness? Put the stuff you
want to observe in an MVar or a TMVar.
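A sketch of that idiom: hand the IORef itself through an MVar, and
let the MVar operations serve as the synchronization points:

  import Control.Concurrent
  import Data.IORef

  main :: IO ()
  main = do
    box <- newEmptyMVar
    _   <- forkIO $ do
             r <- newIORef (0 :: Int)
             writeIORef r 42
             putMVar box r     -- publish: the MVar is the sync point
    r <- takeMVar box          -- synchronizes with the putMVar above
    readIORef r >>= print      -- guaranteed to print 42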
Where does this leave us? Roughly where Simon noted: it's easy to do
shallow things, like fetch elimination in the absence of intervening
calls to unknown functions, and hard to do deep optimizations. I
suspect that shallow things are all that is called for.
We love to criticize C for all the things it has done badly. But
let's not assume that everything C compilers do is stupid or a bad
idea. [Oddly, I find myself telling C programmers the same thing
about Fortran all the time.]
-Jan-Willem Maessen
> There's no requirement that all interleavings according to the
> semantics have to be implemented. This is a hard property to state
> precisely; indeed we gave up trying to in the concurrency/FFI paper:
> http://www.haskell.org/~simonmar/papers/conc-ffi.pdf, see Section 6.1.
>
> Cheers,
> Simon