[GHC] #12607: Memory effects in doomed STM transactions
GHC
ghc-devs at haskell.org
Wed Sep 21 13:39:28 UTC 2016
#12607: Memory effects in doomed STM transactions
-------------------------------------+-------------------------------------
Reporter: fryguybob | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime | Version: 8.0.1
System |
Keywords: STM, allocate | Operating System: Unknown/Multiple
Architecture: | Type of failure: None/Unknown
Unknown/Multiple |
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
== Problem ==
GHC's STM implementation does "lazy" validation meaning that speculative
execution can continue even after observing an inconsistent view of
memory. Transactions in the state are called "doomed transactions".
While the runtime tries to avoids some effects in doomed transactions,
such as entering an infinite loop, it does not successfully control all
memory effects. In particular there are many (I do not have an exhaustive
list) allocations that eventually end up in the `allocate` function which
is nicely documented with the following:
> `allocate()` never
> fails; if it returns, the returned value is a valid address. If
> the nursery is already full, then another block is allocated from
> the global block pool. If we need to get memory from the OS and
> that operation fails, then the whole process will be killed.
https://github.com/ghc/ghc/blob/master/rts/sm/Storage.c#L819
A doomed transaction that demands a large allocation that the OS cannot
fulfill will kill the process. Here is a program that reliably fails for
me:
{{{#!hs
-- oom.hs
{-# LANGUAGE BangPatterns #-}
import GHC.Conc
import Control.Concurrent
import qualified Data.ByteString as B
forever act = act >> forever act
check True = return ()
check False = retry
main = do
tx <- newTVarIO 0
ty <- newTVarIO 1
tz <- newTVarIO 0
done <- newTVarIO False
_ <- forkIO $ forever $ atomically $ do
-- Only read tx and ty
x <- readTVar tx
y <- readTVar ty
if x > y -- This should always be False
then do
-- We only get here in a doomed transaction. Commenting out
the next two lines
-- but leaving the write to done and the program runs as
expected because the
-- doomed transaction is detected at commit time.
let !big = B.length $ B.replicate maxBound 0 -- The big
allocation!
writeTVar tz big
writeTVar done True
else return ()
let mut = forever $ atomically $ do
y <- readTVar ty
x <- readTVar tx
if x > 1000
then do
-- When we get big enough, start over.
writeTVar tx 0
writeTVar ty 1 -- Always holds semantically that tx < ty
else do
-- increment x and y both
writeTVar ty (succ y)
writeTVar tx (succ x) -- tx < ty
-- Give lots of opportunities to witness inconsistent memory.
_ <- forkIO mut
_ <- forkIO mut
_ <- forkIO mut
atomically $ readTVar done >>= check
putStrLn "Done"
}}}
Running leads to out of memory:
{{{#!bash
> ghc oom.hs -fno-omit-yields -threaded
[1 of 1] Compiling Main ( oom.hs, oom.o )
Linking oom.exe ...
> ./oom.exe +RTS -N
oom.exe: out of memory
}}}
== Fixing ==
When a potentially bad memory effect is about to happen in some thread, we
need to ensure that we validate any running transaction and if it fails
have some way to back out of the operation and abort the transaction.
This might be tricky on two fronts 1) finding all the critical allocations
and 2) finding places where we can both detect (1) and abort the
transaction. The places I'm concerned about most are array allocation
such as the `ByteString` in the example and maybe `Integer` allocation.
Another fix is a different STM implementation altogether that (at a
performance cost or trade-off) doesn't allow doomed transactions (to
appear Haskell Symposium 2016 :D). I think we can at least address this
particular example without going so far.
A related issue is more general memory leaks from doomed transactions.
Consider a program with a memoized Fibonacci function. The semantics of
the transactions written by the programmer may ensure that no value higher
then 10 is ever demanded of that function, yet in a doomed transaction the
invariant goes wrong and the program demands 100. The program will detect
the doomed transaction eventually in this case, but not before it has
allocated a large live blob never to be touched again.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/12607>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list