[commit: ghc] master: base: Make Foreign.Marshal.Alloc.allocBytes[Aligned] NOINLINE (56590db)

git at git.haskell.org git at git.haskell.org
Mon Jul 30 22:00:14 UTC 2018


Repository : ssh://git@git.haskell.org/ghc

On branch  : master
Link       : http://ghc.haskell.org/trac/ghc/changeset/56590db07a776ce81eb89d4a4d86bd0f953fb44e/ghc

>---------------------------------------------------------------

commit 56590db07a776ce81eb89d4a4d86bd0f953fb44e
Author: Ben Gamari <ben at smart-cactus.org>
Date:   Tue Oct 24 12:19:08 2017 -0400

    base: Make Foreign.Marshal.Alloc.allocBytes[Aligned] NOINLINE
    
    As noted in #14346, touch# may be optimized away when the simplifier can see
    that the continuation passed to allocaBytes will not return. Marking CPS-style
    functions with NOINLINE ensures that the simplifier can't draw any unsound
    conclusions.
    
    Ultimately the right solution here will be to do away with touch# and instead
    introduce a scoped primitive as is suggested in #14375.
    
    Note: This was present in 8.2 but was never merged to 8.4 in hopes that
    we would have #14375 implemented in time. This meant that the issue
    regressed again in 8.4. Thankfully we caught it in time to fix it for
    8.6.
    
    (cherry picked from commit 404bf05ed3193e918875cd2f6c95ae0da5989be2)
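
For illustration, the pattern described above can be sketched as a
standalone function. This is a minimal sketch, not the code in base: the
names withPinnedBytes and roundTrip are hypothetical, and the body only
mirrors the shape of allocaBytes visible in the diff context. The touch#
call sits in the continuation that runs after the user's action; if the
simplifier could inline the function and see that the action never
returns, that continuation (and with it the touch#) would be dead code,
which is why the sketch carries the same NOINLINE pragma.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import Data.Word (Word8)
import Foreign.Storable (peek, poke)
import GHC.Exts
import GHC.IO (IO (..))

-- Hypothetical helper mirroring the shape of allocaBytes: allocate a
-- pinned byte array, hand its address to the action, then touch# the
-- array so it stays alive until the action has finished.
withPinnedBytes :: Int -> (Ptr a -> IO b) -> IO b
withPinnedBytes (I# size) action = IO $ \s0 ->
  case newPinnedByteArray# size s0 of { (# s1, mbarr# #) ->
  case unsafeFreezeByteArray# mbarr# s1 of { (# s2, barr# #) ->
  let addr = Ptr (byteArrayContents# barr#) in
  case action addr of { IO action' ->
  case action' s2 of { (# s3, r #) ->
  -- This touch# lives in the continuation after the action. It is only
  -- reachable if the action returns.
  case touch# barr# s3 of { s4 ->
  (# s4, r #)
  }}}}}
-- Keep the simplifier from inlining the body at call sites, where it
-- might learn that a particular action diverges.
{-# NOINLINE withPinnedBytes #-}

-- Hypothetical usage: write one byte into the buffer and read it back.
roundTrip :: Word8 -> IO Word8
roundTrip x = withPinnedBytes 1 $ \p -> poke p x >> peek p
```

The pragma here follows the workaround in this commit; it does not make
touch# robust in general, which is what the scoped primitive proposed in
#14375 is meant to address.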


>---------------------------------------------------------------

56590db07a776ce81eb89d4a4d86bd0f953fb44e
 libraries/base/Foreign/Marshal/Alloc.hs | 17 +++++++++++++++++
 testsuite/tests/perf/should_run/all.T   |  3 ++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/libraries/base/Foreign/Marshal/Alloc.hs b/libraries/base/Foreign/Marshal/Alloc.hs
index 48ed7fb..c32f0b6 100644
--- a/libraries/base/Foreign/Marshal/Alloc.hs
+++ b/libraries/base/Foreign/Marshal/Alloc.hs
@@ -116,6 +116,19 @@ alloca :: forall a b . Storable a => (Ptr a -> IO b) -> IO b
 alloca  =
   allocaBytesAligned (sizeOf (undefined :: a)) (alignment (undefined :: a))
 
+-- Note [NOINLINE for touch#]
+-- ~~~~~~~~~~~~~~~~~~~~~~~~~~
+-- Both allocaBytes and allocaBytesAligned use touch#, which is notoriously
+-- fragile in the presence of simplification (see #14346). In particular, the
+-- simplifier may drop the continuation containing the touch# if it can prove
+-- that the action passed to allocaBytes will not return. The hack introduced to
+-- fix this for 8.2.2 is to mark allocaBytes as NOINLINE, ensuring that the
+-- simplifier can't see the divergence.
+--
+-- These pragmas can be removed once #14375 is resolved; that ticket suggests
+-- that we do away with touch# in favor of a primitive that captures the scoping
+-- left implicit in the case of touch#.
+
 -- |@'allocaBytes' n f@ executes the computation @f@, passing as argument
 -- a pointer to a temporarily allocated block of memory of @n@ bytes.
 -- The block of memory is sufficiently aligned for any of the basic
@@ -134,6 +147,8 @@ allocaBytes (I# size) action = IO $ \ s0 ->
      case touch# barr# s3 of { s4 ->
      (# s4, r #)
   }}}}}
+-- See Note [NOINLINE for touch#]
+{-# NOINLINE allocaBytes #-}
 
 allocaBytesAligned :: Int -> Int -> (Ptr a -> IO b) -> IO b
 allocaBytesAligned (I# size) (I# align) action = IO $ \ s0 ->
@@ -145,6 +160,8 @@ allocaBytesAligned (I# size) (I# align) action = IO $ \ s0 ->
      case touch# barr# s3 of { s4 ->
      (# s4, r #)
   }}}}}
+-- See Note [NOINLINE for touch#]
+{-# NOINLINE allocaBytesAligned #-}
 
 -- |Resize a memory area that was allocated with 'malloc' or 'mallocBytes'
 -- to the size needed to store values of type @b@.  The returned pointer
diff --git a/testsuite/tests/perf/should_run/all.T b/testsuite/tests/perf/should_run/all.T
index 7a52492..9705a08 100644
--- a/testsuite/tests/perf/should_run/all.T
+++ b/testsuite/tests/perf/should_run/all.T
@@ -466,11 +466,12 @@ test('T9203',
                       # 2016-04-06     84345136 (i386/Debian) not sure
                       # 2017-03-24     77969268 (x86/Linux, 64-bit machine) probably join points
 
-                      , (wordsize(64), 84620888, 5) ]),
+                      , (wordsize(64), 98360576, 5) ]),
                       # was            95747304
                      # 2014-09-10     94547280 post-AMP cleanup
                       # 2015-10-28     95451192 emit Typeable at definition site
                       # 2016-12-19     84620888 Join points
+                      # 2018-07-30     98360576 it's unclear
       only_ways(['normal'])],
      compile_and_run,
      ['-O2'])


