Lazy ST vs concurrency

David Feuer david at
Tue Jan 31 17:06:07 UTC 2017

I think Ben's eagerlyBlackhole is what I called noDup. And it is indeed a bit tricky to use 
correctly. It needs the same sort of care around inlining that unsafePerformIO does. A 
bit of summary:

1. It's okay to duplicate a runST thunk or to enter it twice, because each copy will have 
its own set of references and arrays. This is important, because we have absolutely no 
control over what the user will do with such a thunk.

2. With the exception of a runST thunk, we must never duplicate or double-enter a 
thunk if it performs or suspends ST work.

Based on my tests and the intuition I've developed about what's going on here, (2) 
breaks down into two pieces:

2a. Any time we perform or suspend ST work, we must use NOINLINE to avoid 

2b. Any time we suspend ST work, we must set up the thunk involved with noDuplicate# 
or similar.

For example, the code I wrote yesterday for the Applicative instance looks like this:

    fm <*> xm = ST $ \ s ->
         {-# NOINLINE res1 #-}
         !res1 = unST fm s
         !(f, s') = res1

         {-# NOINLINE res2 #-}
         res2 = noDup (unST xm s')
         (x, s'') = res2
       in (f x, s'')

I NOINLINE res1. If it were to inline (could that happen?), we'd get

    res2 = noDup (unST xm (snd (unST fm s)))
    (x, s'') = res2
  in (fst (unST fm s) x, s')

and that would run the fm computation twice. But I don't noDup res1, because we force 
it immediately on creation; no one else ever handles it.

I NOINLINE res2 for a similar reason, but I also use noDup on it. The res2 thunk escapes 
into the wild via x and s'' in the result; we need to make sure that it is not entered twice.

I believe can use a few rewrite rules to reduce costs substantially in some situations. I 
will add those to the differential. 

-------- Original message --------
From: Simon Marlow <marlowsd at> 
Date: 1/31/17 3:59 AM (GMT-05:00) 
To: Simon Peyton Jones <simonpj at> 
Cc: David Feuer <david at>, ghc-devs at 
Subject: Re: Lazy ST vs concurrency 

On 30 January 2017 at 22:56, Simon Peyton Jones <simonpj at[1]> wrote:

We don’t want to do this on a per-module basis do we, as -fatomic-eager-blackholing 
would suggest.  Rather, on per-thunk basis, no?  Which thunks, precisely?   I think 
perhaps *precisely thunks one of whose free variables has type (Sttate# s) for some s.*  
These are thunks that consume a state token, and must do so no more than once.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

More information about the ghc-devs mailing list