Understanding behavior of BlockedIndefinitelyOnMVar exception

Edward Z. Yang ezyang at MIT.EDU
Tue Jul 26 07:25:17 CEST 2011


Hello Brandon,

The answer is subtle, and has to do with which references the compiled
code keeps around, since those determine whether an object is considered
reachable.  Essentially, the main thread itself keeps the MVar live while
it still has forking to do, so the MVar cannot get garbage collected and
trigger these errors.

Here is a simple demonstrative program:

    import Control.Concurrent

    main = do
        lock <- newMVar ()
        forkIO (takeMVar lock)
        forkIO (takeMVar lock)
        forkIO (takeMVar lock)

Consider what the underlying code needs to do after it has performed
the first forkIO.  'lock' is a local variable that the code generator
knows it's going to need later in the function body. So what does it
do? It saves it on the stack.

    // R1 is a pointer to the MVar
    cqo:
        Hp = Hp + 8;
        if (Hp > HpLim) goto cqq;
        I32[Hp - 4] = spd_info;
        I32[Hp + 0] = R1;
        I32[Sp + 0] = R1;
        R1 = Hp - 3;
        I32[Sp - 4] = spe_info;
        Sp = Sp - 4;
        jump stg_forkzh ();

(Ignore the Hp > HpLim; that's just the heap check.)

The saved pointer lives on until we continue executing the main thread at
spe_info (at which point we may or may not deallocate the stack frame).
But what happens instead?

    cqk:
        Hp = Hp + 8;
        if (Hp > HpLim) goto cqm;
        I32[Hp - 4] = sph_info;
        I32[Hp + 0] = I32[Sp + 4];
        R1 = Hp - 3;
        I32[Sp + 0] = spi_info;
        jump stg_forkzh ();

We keep the pointer to the MVar on the stack, because we know there
is yet /another/ forkIO (takeMVar lock) coming up. (It's located at
Sp + 4; you have to squint a little since Sp is being fiddled
with, but it's still there, we just overwrite the infotable with
a new one.)

Finally, spi_info decides we don't need the contents of Sp + 4 anymore,
and overwrites it accordingly:

    cqg:
        Hp = Hp + 8;
        if (Hp > HpLim) goto cqi;
        I32[Hp - 4] = spl_info;
        I32[Hp + 0] = I32[Sp + 4];
        R1 = Hp - 3;
        I32[Sp + 4] = spm_info;
        Sp = Sp + 4;
        jump stg_forkzh ();

But in the meantime (esp. between invocation 2 and 3), the MVar cannot be
garbage collected, because it is live on the stack.
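
The same effect can be observed from the Haskell side.  What follows is
only a minimal sketch, not taken from the original program: the names
livenessDemo and woken are made up for illustration, and it assumes
compiled code (GHCi may keep references alive for longer) and that a
0.1s delay is enough for the forked thread to get scheduled.  performGC
forces a major collection, which is when the runtime looks for threads
blocked on unreachable MVars.

```haskell
import Control.Concurrent
import Control.Exception
import Data.IORef
import System.Mem (performGC)

-- Hypothetical helper: reports whether the forked thread had been woken
-- with BlockedIndefinitelyOnMVar (before, after) the main thread's last
-- use of lock.
livenessDemo :: IO (Bool, Bool)
livenessDemo = do
    woken <- newIORef False
    lock  <- newEmptyMVar :: IO (MVar ())
    _ <- forkIO $
        takeMVar lock `catch` \BlockedIndefinitelyOnMVar -> writeIORef woken True
    threadDelay 100000        -- assumed long enough for the child to block
    performGC                 -- lock is still needed below, so it stays live
    threadDelay 100000
    before <- readIORef woken
    _ <- isEmptyMVar lock     -- last use of lock in the main thread
    performGC                 -- only the blocked thread should refer to lock now
    threadDelay 100000        -- assumed long enough for the handler to run
    after <- readIORef woken
    return (before, after)

main :: IO ()
main = livenessDemo >>= print
```

The first component should be False, since the main thread still holds
lock during the first collection.  Whether the second becomes True
depends on the compiler really having dropped the reference by then.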

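Could GHC have been more clever in this case?  Not in general, since deciding
whether a reference will actually be used boils down to the halting problem.
For example, in the following program the first forked thread can only
proceed if loop ever finishes (or is killed):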

    import Control.Concurrent
    import Control.Exception (finally)

    loop = threadDelay 100 >> loop -- prevent blackholing from discovering this
    main = do
        lock <- newEmptyMVar
        t1 <- newEmptyMVar
        forkIO (takeMVar lock >> putMVar t1 ())
        forkIO (loop `finally` putMVar lock ())
        takeMVar t1

Maybe we could do something where MVar references are known to be write ends
or read ends, and let the garbage collector know that an MVar with only read
ends left is a deadlocked one.  However, this would be a very imprecise
analysis, and would not help in your original code (since all of your remaining
threads had the possibility of writing to the MVar: it doesn't become clear
that they can't until they all hit their takeMVar statements.)
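
To make the read end / write end idea a bit more concrete, here is a
rough sketch of what the split could look like at the library level.
Everything in it (ReadEnd, WriteEnd, newEnds, readFrom, writeTo) is
hypothetical: GHC provides no such API, and its garbage collector does
not distinguish the two kinds of reference, so this only shows the shape
of the information that would have to be tracked.

```haskell
import Control.Concurrent

-- Hypothetical wrappers: each one exposes only half of the MVar interface.
newtype ReadEnd a  = ReadEnd (MVar a)
newtype WriteEnd a = WriteEnd (MVar a)

-- Create one empty MVar and hand out its two ends separately.
newEnds :: IO (ReadEnd a, WriteEnd a)
newEnds = do
    m <- newEmptyMVar
    return (ReadEnd m, WriteEnd m)

-- A thread holding only a ReadEnd can take, never put.
readFrom :: ReadEnd a -> IO a
readFrom (ReadEnd m) = takeMVar m

-- A thread holding only a WriteEnd can put, never take.
writeTo :: WriteEnd a -> a -> IO ()
writeTo (WriteEnd m) x = putMVar m x

main :: IO ()
main = do
    (r, w) <- newEnds
    _ <- forkIO (writeTo w (42 :: Int))
    readFrom r >>= print
```

With ordinary MVars the wrappers change nothing about when a blocked
thread is detected; a collector that understood them would additionally
have to treat a ReadEnd as not keeping the MVar "writable".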

Cheers,
Edward


