How to ensure optimization for large immutable vectors to be shared w.r.t. Referential Transparency

Tue Apr 6 16:27:27 UTC 2021

On Tue, Apr 06, 2021 at 11:10:51AM -0400, Viktor Dukhovni wrote:

> > λ> let v = VS.fromList [3,2,5] in isSameVector (SomeVector v) (SomeVector v)
> 
> One thing I'm not sure about, that perhaps someone else can shed light
> on, is whether with optimisation one might expect the two (SomeVector v)
> values to be subject to CSE, given that they both invoke `v` at the same
> type.  Is there a non-default optimisation flag that makes CSE more
> aggressive that would make that happen?

On a hunch I tried suppressing the inlining of the definition of `v`,
and CSE then kicked in...

    {-# LANGUAGE BangPatterns #-}
    {-# LANGUAGE NoMonomorphismRestriction #-}

    import Control.Monad.ST
    import Data.Coerce
    import qualified Data.Vector.Storable as VS

    newtype SomeVector = SomeVector (VS.Vector Int)                                     

    isSameVector :: SomeVector -> SomeVector -> Bool
    isSameVector !(SomeVector x) !(SomeVector y) = runST $ do
      mx@(VS.MVector !x'offset !x'fp) <- VS.unsafeThaw x
      my@(VS.MVector !y'offset !y'fp) <- VS.unsafeThaw y
      _ <- VS.unsafeFreeze mx
      _ <- VS.unsafeFreeze my
      return $ x'offset == y'offset && x'fp == y'fp

    makev = VS.fromList [0..1023]
    {-# NOINLINE makev #-}

    main :: IO ()
    main = 
        let !v = makev
         in print $ isSameVector (SomeVector v) (SomeVector v)

So it appears that inlining of `v` into (SomeVector v) is the proximate
barrier to identifying the two (SomeVector v) terms.  Is this expected?

-- 
    Viktor.