Touching unlifted values

Carter Schonwald carter.schonwald at gmail.com
Fri Sep 21 00:28:56 UTC 2018


Hey Andrew,
theres definitely optimizations in ghc that (roughly? i'm not the best
expert) unwrap / optimize away single constructor data types in certain
cases (haha, cases),

I forget the name of the specific optimization, but its a pretty well
documented one in ghc

I think its the CPR analysis? I could be wrong
https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/Demand  (i could
be wrong though)

either way, i seem to recall you'll be at ICFP next week, so thats def a
venue i or someone else can help you sleuth it at

On Thu, Sep 20, 2018 at 8:50 AM Andrew Martin <andrew.thaddeus at gmail.com>
wrote:

> The touch# primitive accepts a levity-polymorphic argument. I am wondering
> if there is ever any difference between using it on a lifted value and an
> unlifted value. Consider the following:
>
>     module Lifted where
>
>     import Control.Monad.ST (runST)
>     import Control.Monad.Primitive (touch)
>     import Data.Int (Int64)
>     import Data.Primitive
> (newPinnedByteArray,mutableByteArrayContents,readOffAddr,writeOffAddr)
>
>     computation :: Int64
>     computation = runST $ do
>       arr <- newPinnedByteArray 8
>       let addr = mutableByteArrayContents arr
>       writeOffAddr addr 0 (42 :: Int64)
>       i <- readOffAddr addr 0
>       touch arr
>       return i
>
> Calling touch on the mutable byte array is necessary to make sure that the
> memory that the Addr points doesn't get GCed while we are writing and
> reading to and from it. Here is the relevant GHC core (compiled with -O2):
>
>     -- RHS size: {terms: 32, types: 67, coercions: 19, joins: 0/1}
>     computation1
>     computation1
>       = \ s1_a24R ->
>           case newPinnedByteArray# 8# (s1_a24R `cast` <Co:4>) of
>           { (# ipv_a246, ipv1_a247 #) ->
>           let {
>             addr_s273
>             addr_s273 = byteArrayContents# (ipv1_a247 `cast` <Co:9>) } in
>           case writeInt64OffAddr# addr_s273 0# 42# ipv_a246 of s'#_a24p
>           { __DEFAULT ->
>           case readInt64OffAddr# addr_s273 0# s'#_a24p of
>           { (# ipv2_a267, ipv3_a268 #) ->
>           case touch#
>                  ((MutableByteArray ipv1_a247) `cast` <Co:3>)
>                  (ipv2_a267 `cast` <Co:3>)
>           of s'_a24K
>           { __DEFAULT ->
>           (# s'_a24K, I64# ipv3_a268 #)
>           }
>           }
>           }
>           }
>
>     -- RHS size: {terms: 5, types: 30, coercions: 0, joins: 0/0}
>     computation
>     computation
>       = case runRW# computation1 of { (# ipv_a239, ipv1_a23a #) ->
>         ipv1_a23a
>         }
>
> Instead, what if we touched the underlying unlifted MutableByteArray#?
> Here is the code for doing this:
>
>     {-# language MagicHash #-}
>     {-# language UnboxedTuples #-}
>
>     module Unlifted
>       ( computation
>       ) where
>
>     import System.IO.Unsafe (unsafeDupablePerformIO)
>     import Control.Monad.Primitive
> (unsafePrimToPrim,primitive,PrimState,PrimMonad)
>     import Data.Int (Int64)
>     import Data.Primitive
> (MutableByteArray(..),newPinnedByteArray,mutableByteArrayContents,readOffAddr,writeOffAddr)
>     import GHC.Exts (touch#,MutableByteArray#)
>
>     computation :: Int64
>     computation = unsafeDupablePerformIO $ do
>       arr@(MutableByteArray arr#) <- newPinnedByteArray 8
>       let addr = mutableByteArrayContents arr
>       writeOffAddr addr 0 (42 :: Int64)
>       i <- readOffAddr addr 0
>       touchUnlifted arr#
>       return i
>
>     touchUnlifted :: PrimMonad m => MutableByteArray# (PrimState m) -> m ()
>     touchUnlifted x = unsafePrimToPrim
>       $ (primitive (\s -> case touch# x s of { s' -> (# s', () #) }) :: IO
> ())
>
> GHC generates the following core for this module (again, omitting
> irrelevant parts):
>
>     computation1
>     computation1
>       = \ s_a49h ->
>           case newPinnedByteArray# 8# (s_a49h `cast` <Co:3>) of
>           { (# ipv_a48a, ipv1_a48b #) ->
>           let {
>             addr_s4aY
>             addr_s4aY = byteArrayContents# (ipv1_a48b `cast` <Co:8>) } in
>           case writeInt64OffAddr# addr_s4aY 0# 42# ipv_a48a of s'#_a48A
>           { __DEFAULT ->
>           case readInt64OffAddr# addr_s4aY 0# s'#_a48A of
>           { (# ipv2_a4aq, ipv3_a4ar #) ->
>           case touch# ipv1_a48b (ipv2_a4aq `cast` <Co:2>) of s'_a2xn
>           { __DEFAULT ->
>           (# s'_a2xn, I64# ipv3_a4ar #)
>           }
>           }
>           }
>           }
>
>     -- RHS size: {terms: 5, types: 30, coercions: 0, joins: 0/0}
>     computation
>     computation
>       = case runRW# computation1 of { (# ipv_a47X, ipv1_a47Y #) ->
>         ipv1_a47Y
>         }
>
> I feel confident that both of these are semantically equivalent. Both uses
> of touch# should keep the MutableByteArray# alive until we are done using
> the pointer we extracted from it. What I'm less sure about is whether or
> not the first one actually does an alloctation for the MutableByteArray
> data constructor when it calls touch. Is this eliminated in some other
> stage of compilation?
>
> --
> -Andrew Thaddeus Martin
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20180920/ae75c389/attachment.html>


More information about the Libraries mailing list