[commit: ghc] wip/dfeuer-interleave-null: Speed up unsafeInterleaveIO (f2851e1)

git at git.haskell.org git at git.haskell.org
Wed May 3 18:36:01 UTC 2017


Repository : ssh://git@git.haskell.org/ghc

On branch  : wip/dfeuer-interleave-null
Link       : http://ghc.haskell.org/trac/ghc/changeset/f2851e13ae66a38dedec6d7c59aca3fe99bfb817/ghc

>---------------------------------------------------------------

commit f2851e13ae66a38dedec6d7c59aca3fe99bfb817
Author: David Feuer <David.Feuer at gmail.com>
Date:   Tue May 2 17:09:00 2017 -0400

    Speed up unsafeInterleaveIO
    
    Summary:
    Use an `MVar` and a "null pointer" trick I learned from Edward
    Kmett to try to make `unsafeInterleaveIO` faster in the threaded
    runtime, where `noDuplicate#` is not always cheap.
    
    Reviewers: austin, hvr, bgamari
    
    Subscribers: rwbarton, thomie
    
    Differential Revision: https://phabricator.haskell.org/D3529


>---------------------------------------------------------------

f2851e13ae66a38dedec6d7c59aca3fe99bfb817
 libraries/base/GHC/IO/Unsafe.hs | 77 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 74 insertions(+), 3 deletions(-)

diff --git a/libraries/base/GHC/IO/Unsafe.hs b/libraries/base/GHC/IO/Unsafe.hs
index c1c07ae..698b9fd 100644
--- a/libraries/base/GHC/IO/Unsafe.hs
+++ b/libraries/base/GHC/IO/Unsafe.hs
@@ -26,7 +26,7 @@ module GHC.IO.Unsafe (
   ) where
 
 import GHC.Base
-
+import GHC.MVar
 
 {-|
 This is the \"back door\" into the 'IO' monad, allowing
@@ -111,12 +111,83 @@ file reading, see 'System.IO.hGetContents'.
 -}
 {-# INLINE unsafeInterleaveIO #-}
 unsafeInterleaveIO :: IO a -> IO a
-unsafeInterleaveIO m = unsafeDupableInterleaveIO (noDuplicate >> m)
+-- See Note [Null pointers in unsafeInterleaveIO]
+unsafeInterleaveIO m = do
+  v <- case unclaimed of
+         Box r -> unsafeCoerce# newMVar r
+  unsafeDupableInterleaveIO $ do
+  a <- takeMVar v
+  if isUnclaimed a
+    then do
+      res <- m
+      putMVar v res
+      pure res
+    else a <$ putMVar v a
+
+-- The 'Unclaimed' constructor must not be exported.
+data Unclaimed = Unclaimed
+data Box = Box !Unclaimed
+
+-- We use 'unclaimed' as a "null pointer" in 'unsafeInterleaveIO'.
+-- It must not be exported!
+-- See Note [Null pointers in unsafeInterleaveIO]
+{-# NOINLINE unclaimed #-}
+unclaimed :: Box
+unclaimed = Box Unclaimed
+
+isUnclaimed :: a -> Bool
+isUnclaimed a = case unclaimed of
+  Box r -> isTrue# (unsafeCoerce# reallyUnsafePtrEquality# a r)
+
+-- Note [Null pointers in unsafeInterleaveIO]
+-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+--
+-- Conceptually, we're implementing this:
+--
+-- unsafeInterleaveIO :: IO a -> IO a
+-- unsafeInterleaveIO m = do
+--   v <- newMVar Nothing
+--   unsafeDupableInterleaveIO $ do
+--     r <- takeMVar v
+--     case r of
+--       -- We're the first ones to get the MVar, so we actually
+--       -- do the work.
+--       Nothing -> do
+--         a <- m
+--         putMVar v (Just a)
+--         pure a
+--
+--       -- Someone else has claimed the action, so we use
+--       -- their result and put it back in the MVar.
+--       j@(Just a) -> a <$ putMVar v j
+--
+-- The MVar starts out full, with Nothing in it. When the interleaved
+-- computation is complete, the result will be stored in the MVar in a Just
+-- constructor. The interleaved computation, which may run in multiple
+-- threads, takes the MVar, checks whether it's Nothing or Just, and either
+-- performs the interleaved computation or just puts the Just back.
+--
+-- However, allocating Just constructors is wasteful; we can pretend we're
+-- writing in C and use a distinguished "null pointer" to represent Nothing
+-- instead. We magic up a single, global null pointer and use that every time.
+-- The usual problem with null pointers is that they can't distinguish, among
+-- Nothing, Just Nothing, Just (Just Nothing), etc. Fortunately, we don't have
+-- to worry about that here. The null pointer is private to this module, so
+-- it is impossible for the computation passed to 'unsafeInterleaveIO' to
+-- produce it.
+--
+-- Why do we have to build a box around the distinguished null? I don't
+-- actually know. But without this box, 'reallyUnsafePtrEquality#' does not
+-- seem to detect equality! Note that we rely on the fact that GHC uses
+-- distinct heap locations to represent nullary constructors of distinct
+-- datatypes. If this changes, we can recover the correct behavior by using
+-- 'unsafePerformIO' to allocate something like an 'IORef' and use the
+-- embedded 'MutVar#' as a null pointer.
 
 -- Note [unsafeDupableInterleaveIO should not be inlined]
 -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 --
--- We used to believe that INLINE on unsafeInterleaveIO was safe,
+-- We used to believe that INLINE on unsafeDupableInterleaveIO was safe,
 -- because the state from this IO thread is passed explicitly to the
 -- interleaved IO, so it cannot be floated out and shared.
 --



More information about the ghc-commits mailing list